On rare occasions I have seen Google put itself into a loop. It happened
while I was trying to access the admin dashboard: I kept getting
redirected to their auth page. If CAS had been in front, I'm sure I'd
have seen similar looping. In my case, there was nothing I could do to
resolve it. The system status page showed an issue, and once that was
resolved a few hours later it worked fine for me.

Another thought is that perhaps they have a NoScript-style plugin that
is/was blocking Google cookies. Maybe 3000+ loops was what it took for
the user to figure out how to allow the cookie. :)
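
For what it's worth, here is a minimal sketch of how one might flag this
kind of loop from the ticket logs. It assumes you can already extract one
(timestamp, user, service) record per ST granted; that record shape, the
function name and the default threshold of 1000 per hour (borrowed from
the figures quoted below) are my assumptions, not anything CAS provides.

```python
from collections import Counter
from datetime import datetime, timedelta


def find_st_loops(events, threshold=1000):
    """Return {(user, service, hour): count} for buckets above threshold.

    events: iterable of (timestamp, user, service) tuples, one per ST
    granted. This record shape is hypothetical; adapt it to whatever
    your own audit log parsing produces.
    """
    counts = Counter()
    for ts, user, service in events:
        # Truncate the timestamp to the start of its hour.
        hour = ts.replace(minute=0, second=0, microsecond=0)
        counts[(user, service, hour)] += 1
    return {key: n for key, n in counts.items() if n > threshold}


if __name__ == "__main__":
    start = datetime(2014, 12, 10, 15, 0)
    # Made-up data: one user hitting the same service five times.
    sample = [(start + timedelta(seconds=i), "alice", "google") for i in range(5)]
    sample.append((start, "bob", "google"))
    for (user, service, hour), n in find_st_loops(sample, threshold=3).items():
        print(user, service, hour.isoformat(), n)
```

Bucketing by user and service together keeps the normal case (many
different users, few repeats) from tripping the check.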

On 12/12/14 7:40 AM, Sean Baker wrote:
> Great additional information.  I would add, however, that the end
> description of how things are interacting with Google is not likely. 
> That is -- assuming you've configured Google the same way we have --
> Google would be unable to think that the login has succeeded while CAS
> still has an opportunity to interact with the process.  When I visit
> CAS (with a TGT) while seeking to authenticate to Google, I am issued
> an ST and forwarded to Google's Assertion Consumer Service (the '/acs'
> in the URL) with a cryptographically-signed SAMLResponse.  Google
> validates that Response, and if it's happy that it's valid (multiple
> things to check), it grants me a new session in Google.  As such, I
> think it unlikely that CAS is redirecting the user /back/ to itself,
> as by the time that the Google session is created, it no longer has
> the ability to interact with the user's routing.  More likely, it is
> something about how the user is interacting with Google that is
> reacting badly with your environment.
>
> On 12/12/14, 10:10 AM, David A. Kovacic wrote:
>> Let me also expand on this a bit:
>>
>> In the normal course of events seeing 2-3K/hr worth of STs being
>> created is not unusual for us.  Of those typically half (roughly
>> 1500) are Google logins, the key difference being that they are all
>> different users with very few repeats.  During one of those "normal"
>> hours even with that number of Google logins the heap usage stays
>> fairly stable at about 500MB used out of 1000MB allocated.  A typical
>> pattern looks like:
>>
>> [graph of heap usage not preserved]
>>
>> The graph shows things coming in and going out of the heap on a very
>> regular basis as can be seen. 
>>
>> When the "issue" triggers, the used heap appears to grow without ever
>> dropping again (at least until the heap runs out of memory and the
>> service needs to be restarted).  There MAY be some GC process that
>> would clean up the heap, but it is not running in a timely enough
>> fashion to prevent the heap memory from being exhausted, at least if
>> the "issue" generated sufficient logins to Google.
>>
>> I would speculate, given what we are seeing, that some trigger event
>> (lost SAML response from Google maybe) causes Google to think the
>> login succeeded, but the SSO server to think it failed, causing it
>> to generate a new ST, and another, and another, until something kicks in
>> (some sort of timer?) and the cycle terminates, but several instances
>> of some variable never get properly derefed and take a LONG time to
>> get GCed.
>>
>>
>> On 12/12/14 6:20 AM, David A. Kovacic wrote:
>>> Exactly right.  If we see 3000 STs created in an hour for a user,
>>> all to Google, we are also seeing Google report 3000 successful
>>> logins in the same time frame reported in the Google admin console
>>> audit logs.  As far as I can tell, whatever condition triggers this
>>> (and it may be some form of malware being used to send spam through
>>> us) gets credentials once (only one TGT is ever created) and then
>>> somehow does >1000 logins in about an hour to Google without ever
>>> logging back out of Google.  As far as we can tell, there are no
>>> errors in the process of logging in, but because the Google SAML
>>> process seems to leave fairly large remnants of instances in the
>>> heap, and those remnants are not being GCed in a timely fashion, we
>>> run out of heap memory and the SSO server process locks up, taking
>>> the other server with it.  To summarize, it seems not to be the SAML
>>> process itself, but the VOLUME of SAML processes in a very short
>>> time that seems to cause the issue.
>>>
>>>
>>> On 12/11/14 9:01 PM, Sean Baker wrote:
>>>> Now that's interesting -- is that to say that when you see these
>>>> rapidly-generated service tickets for particular users you're
>>>> seeing them logging in as many times to Google as well?
>>>>
>>>>
>>>>
>>>>
>>>> On 12/11/14, 14:17 PM, David A. Kovacic wrote:
>>>>> Google seems to be accepting the assertions each time as we are
>>>>> seeing the same number of logins in Google's audit logs as the
>>>>> number of STs being created.  I would expect that if there was
>>>>> something wrong with the assertion we would be receiving complaints
>>>>> from the users.  I am more inclined at this point to believe some
>>>>> sort of crazy browser loop, but it's definitely not happening with
>>>>> any consistency. 
>>>>>
>>>>> We have tried contacting the two people we identified once we
>>>>> started to get a handle on what the issue was, however neither has
>>>>> responded.  That's not terribly surprising given that we are in
>>>>> our finals period here and requests for information go pretty much
>>>>> ignored by students and faculty alike at this time.
>>>>>
>>>>> Dave
>>>>>
>>>>> On 12/10/14 8:14 PM, Sean Baker wrote:
>>>>>> Your access logs should show the individual SAMLRequests generated by 
>>>>>> Google; if it's rejecting your assertions in some automated way you 
>>>>>> should see a new SAMLRequest each time.  If it's the same request over 
>>>>>> and over, one might infer a more local issue (not definitively mind you; 
>>>>>> just much more likely) [ehcache issue, browser configuration, etc.].
>>>>>>
>>>>>> Has anyone talked with your end users who're triggering these events 
>>>>>> about what they experienced?
>>>>>>
>>>>>> On 12/10/14, 15:16 PM, David A. Kovacic wrote:
>>>>>>> Does anyone know what I would need to do to be able to log the
>>>>>>> actual SAML transactions?  Is there any way to actually do
>>>>>>> that?  We have isolated this issue to only logins to Google and
>>>>>>> only under certain conditions when something seems to start
>>>>>>> looping and generating STs rapidly.  We are trying to isolate
>>>>>>> the conditions under which the loop starts. 
>>>>>>>
>>>>>>> It would be helpful to actually see the SAML transactions being
>>>>>>> generated so we could begin to get a handle on which Google app
>>>>>>> is being referenced and if Google is returning any errors or not
>>>>>>> (although Google claims valid logins).
>>>>>>>
>>>>>>>
>>>>>>> On 12/6/14 9:11 AM, Marvin Addison wrote:
>>>>>>>>
>>>>>>>>     Second, the massive number of STs are being created on
>>>>>>>>     only one server (we can tell by the host name in the logged
>>>>>>>>     ST) but the OTHER SERVER is where the memory is growing out
>>>>>>>>     of bounds.
>>>>>>>>
>>>>>>>>
>>>>>>>> I'm still working through this thread, but I wanted to point
>>>>>>>> out that the other server is likely hurting because of load balancer
>>>>>>>> session affinity. Recall that ticket validation is a
>>>>>>>> back-channel call, and the network source differs from that of
>>>>>>>> the user's browser. In our environment, services typically get
>>>>>>>> stuck on one node causing hot spots. This is because the
>>>>>>>> service is validating tickets frequently enough that the
>>>>>>>> session affinity timeout never kicks in.
>>>>>>>>
>>>>>>>> M
>>>>>>>>
>>>>>>>> -- 
>>>>>>>> You are currently subscribed to cas-user@lists.jasig.org as: 
>>>>>>>> d...@case.edu
>>>>>>>> To unsubscribe, change settings or access archives, see 
>>>>>>>> http://www.ja-sig.org/wiki/display/JSG/cas-user
>>>>>>>>
>>>>>>
>>>>
>
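
On the earlier suggestion in the thread to compare the SAMLRequest values
in the access logs: a rough sketch of decoding one, assuming Google sends
the request over the SAML HTTP-Redirect binding (raw DEFLATE, then
base64, then URL-encoding) and that the full request URL is available
from the log. The function names and the example URL are made up for
illustration.

```python
import base64
import xml.etree.ElementTree as ET
import zlib
from urllib.parse import parse_qs, urlencode, urlparse


def decode_saml_request(url):
    """Return the XML of the SAMLRequest carried in a redirect-binding URL."""
    query = parse_qs(urlparse(url).query)
    encoded = query["SAMLRequest"][0]  # parse_qs has already URL-decoded it
    compressed = base64.b64decode(encoded)
    # The redirect binding uses raw DEFLATE with no zlib header: wbits=-15.
    return zlib.decompress(compressed, -15).decode("utf-8")


def saml_request_id(url):
    """Return the ID attribute of the decoded request's root element."""
    return ET.fromstring(decode_saml_request(url)).get("ID")


if __name__ == "__main__":
    # Build a made-up request the same way, then decode it again.
    xml = ('<samlp:AuthnRequest '
           'xmlns:samlp="urn:oasis:names:tc:SAML:2.0:protocol" ID="abc123"/>')
    deflater = zlib.compressobj(wbits=-15)
    raw = deflater.compress(xml.encode("utf-8")) + deflater.flush()
    url = "https://sso.example.edu/cas/login?" + urlencode(
        {"SAMLRequest": base64.b64encode(raw).decode("ascii")})
    print(saml_request_id(url))
```

If the same ID shows up over and over, the same request is being
replayed; a fresh ID each time means Google is issuing a new request on
every pass.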

