I have seen, on rare occasions, Google put itself into a loop. It happened while I was trying to access the admin dashboard: I kept getting redirected to their auth page. If CAS had been in front, I'm sure I'd have seen similar looping. In my case, I couldn't do anything to resolve it. Checking the system status, I found there was an issue; once it was resolved a few hours later, it worked fine for me.
Another thought is that perhaps they have a NoScript-style plugin that is/was blocking Google cookies. 3000+ loops was what it took for the user to figure out how to allow/permit the cookie. :)

On 12/12/14 7:40 AM, Sean Baker wrote:
> Great additional information. I would add, however, that the end
> description of how things are interacting with Google is not likely.
> That is -- assuming you've configured Google the same way we have --
> Google would be unable to think that the login has succeeded while CAS
> still has an opportunity to interact with the process. When I visit
> CAS (with a TGT) while seeking to authenticate to Google, I am issued
> an ST and forwarded to Google's Assertion Consumer Service (the '/acs'
> in the URL) with a cryptographically-signed SAMLResponse. Google
> validates that Response, and if it's happy that it's valid (multiple
> things to check), it grants me a new session in Google. As such, I
> think it unlikely that CAS is redirecting the user /back/ to itself,
> as by the time that the Google session is created, it no longer has
> the ability to interact with the user's routing. More likely
> something to do with how the user is interacting with Google reacting
> badly with your environment.
>
> Sent from my iPhone.
>
> On 12/12/14, 10:10 AM, David A. Kovacic wrote:
>> Let me also expand on this a bit:
>>
>> In the normal course of events seeing 2-3K/hr worth of STs being
>> created is not unusual for us. Of those typically half (roughly
>> 1500) are Google logins, the key difference being that they are all
>> different users with very few repeats. During one of those "normal"
>> hours even with that number of Google logins the heap usage stays
>> fairly stable at about 500MB used out of 1000MB allocated. A typical
>> pattern looks like:
>>
>> The graph shows things coming in and going out of the heap on a very
>> regular basis as can be seen.
>>
>> When the "issue" triggers, the used heap appears to grow without ever
>> dropping again (at least until the heap runs out of memory and the
>> service needs to be restarted). There MAY be some GC process that
>> would clean up the heap, but it is not running in a timely enough
>> fashion to prevent the heap memory from being exhausted, at least if
>> the "issue" generated sufficient logins to Google.
>>
>> I would speculate, given what we are seeing, that some trigger event
>> (lost SAML response from Google maybe) causes Google to think the
>> login succeeded, but the SSO server to think it failed, causing it
>> to generate a new ST, and another, and another, until something kicks
>> in (some sort of timer?) and the cycle terminates, but several
>> instances of some variable never get properly derefed and take a LONG
>> time to get GCed.
>>
>> On 12/12/14 6:20 AM, David A. Kovacic wrote:
>>> Exactly right. If we see 3000 STs created in an hour for a user,
>>> all to Google, we are also seeing Google report 3000 successful
>>> logins in the same time frame reported in the Google admin console
>>> audit logs. As far as I can tell, whatever condition triggers this
>>> (and it may be some form of malware being used to send spam through
>>> us) gets credentials once (only one TGT is ever created) and then
>>> somehow does >1000 logins in about an hour to Google without ever
>>> logging back out of Google. As far as we can tell, there are no
>>> errors in the process of logging in, but because the Google SAML
>>> process seems to leave fairly large remnants of instances in the
>>> heap, and those remnants are not being GCed in a timely fashion, we
>>> run out of heap memory and the SSO server process locks up, taking
>>> the other server with it. To summarize, it seems not to be the SAML
>>> process itself, but the VOLUME of SAML processes in a very short
>>> time that seems to cause the issue.
>>>
>>> On 12/11/14 9:01 PM, Sean Baker wrote:
>>>> Now that's interesting -- is that to say that when you see these
>>>> rapidly-generated service tickets for particular users you're
>>>> seeing them logging in as many times to Google as well?
>>>>
>>>> On 12/11/14, 14:17 PM, David A. Kovacic wrote:
>>>>> Google seems to be accepting the assertions each time as we are
>>>>> seeing the same number of logins in Google's audit logs as the
>>>>> number of STs being created. I would expect that if there was
>>>>> something wrong with the assertion we would be receiving complaints
>>>>> from the users. I am more inclined at this point to believe some
>>>>> sort of crazy browser loop, but it's definitely not happening with
>>>>> any consistency.
>>>>>
>>>>> We have tried contacting the two people we identified once we
>>>>> started to get a handle on what the issue was, however neither has
>>>>> responded. That's not terribly surprising given that we are in
>>>>> our finals period here and requests for information go pretty much
>>>>> ignored by students and faculty alike at this time.
>>>>>
>>>>> Dave
>>>>>
>>>>> On 12/10/14 8:14 PM, Sean Baker wrote:
>>>>>> Your access logs should show the individual SAMLRequests generated by
>>>>>> Google; if it's rejecting your assertions in some automated way you
>>>>>> should see a new SAMLRequest each time. If it's the same request over
>>>>>> and over, one might infer a more local issue (not definitively mind you;
>>>>>> just much more likely) [ehcache issue, browser configuration, etc.].
>>>>>>
>>>>>> Has anyone talked with your end users who're triggering these events
>>>>>> about what they experienced?
>>>>>>
>>>>>> On 12/10/14, 15:16 PM, David A. Kovacic wrote:
>>>>>>> Does anyone know what I would need to do to be able to log the
>>>>>>> actual SAML transactions? Is there any way to actually do
>>>>>>> that?
>>>>>>> We have isolated this issue to only logins to Google and
>>>>>>> only under certain conditions when something seems to start
>>>>>>> looping and generating STs rapidly. We are trying to isolate
>>>>>>> the conditions under which the loop starts.
>>>>>>>
>>>>>>> It would be helpful to actually see the SAML transactions being
>>>>>>> generated so we could begin to get a handle on what Google apps
>>>>>>> is being referenced and if Google is returning any errors or not
>>>>>>> (although Google claims valid logins).
>>>>>>>
>>>>>>> On 12/6/14 9:11 AM, Marvin Addison wrote:
>>>>>>>>
>>>>>>>> Second, the massive number of STs are being created on
>>>>>>>> only one server (we can tell by the host name in the logged
>>>>>>>> ST) but the OTHER SERVER is where the memory is growing out
>>>>>>>> of bounds.
>>>>>>>>
>>>>>>>> I'm still working through this thread, but I wanted to point
>>>>>>>> out that the other server is likely hurting because of load
>>>>>>>> balancer session affinity. Recall that ticket validation is a
>>>>>>>> back-channel call, and the network source differs from that of
>>>>>>>> the user's browser. In our environment, services typically get
>>>>>>>> stuck on one node causing hot spots. This is because the
>>>>>>>> service is validating tickets frequently enough that the
>>>>>>>> session affinity timeout never kicks in.
>>>>>>>>
>>>>>>>> M
>>>>>>>>
>>>>>>>> --
>>>>>>>> You are currently subscribed to cas-user@lists.jasig.org as:
>>>>>>>> d...@case.edu
>>>>>>>> To unsubscribe, change settings or access archives, see
>>>>>>>> http://www.ja-sig.org/wiki/display/JSG/cas-user
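
On the question earlier in the thread about seeing the actual SAML transactions, and the suggestion to check whether Google sends a new SAMLRequest each time: here is a minimal sketch in Python, not anything that ships with CAS. It assumes Google sends its SAMLRequest through the standard SAML 2.0 HTTP-Redirect binding (raw DEFLATE, then base64, then URL-encoding) and that your access logs keep the full query string. The function names are hypothetical, and how you pull the URLs out of your own log format is left to you. It decodes each SAMLRequest and counts how often each request ID shows up.

```python
import base64
import zlib
import xml.etree.ElementTree as ET
from collections import Counter
from urllib.parse import parse_qs, urlsplit


def decode_saml_request(encoded: str) -> str:
    """Decode a SAMLRequest value that has already been URL-decoded.

    Assumes the HTTP-Redirect binding: base64 over a raw DEFLATE stream.
    """
    raw = base64.b64decode(encoded)
    # wbits=-15 tells zlib to expect raw DEFLATE with no zlib header.
    return zlib.decompress(raw, -15).decode("utf-8")


def request_id(xml_text: str) -> str:
    """Return the ID attribute of the root element (the AuthnRequest)."""
    return ET.fromstring(xml_text).attrib["ID"]


def count_request_ids(urls) -> Counter:
    """Count how many times each SAMLRequest ID appears in the given URLs.

    `urls` is any iterable of request URLs (or path?query strings) taken
    from an access log. URLs without a SAMLRequest parameter are skipped.
    """
    counts = Counter()
    for url in urls:
        # parse_qs URL-decodes the parameter values for us.
        params = parse_qs(urlsplit(url).query)
        for value in params.get("SAMLRequest", []):
            counts[request_id(decode_saml_request(value))] += 1
    return counts
```

If one ID dominates the counts for a looping user, the same request is being replayed, which points toward the browser or something local. If every ID is different, Google is issuing a fresh request each time. Printing the output of decode_saml_request also shows the raw AuthnRequest XML, which may help with seeing what Google is actually asking for.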