Re: [cisco-voip] MRA DR / Resilience

ROZA, Ariel Mon, 18 Jan 2021 10:59:58 -0800

I just reread the release notes, and it includes the case where CUCM is down.


From: cisco-voip <cisco-voip-boun...@puck.nether.net> On Behalf Of ROZA, Ariel
Sent: Monday, January 18, 2021 15:53
To: NateCCIE <natec...@gmail.com>; Pawlowski, Adam <aj...@buffalo.edu>
CC: cisco-voip@puck.nether.net
Subject: Re: [cisco-voip] MRA DR / Resilience

But will this include the scenario where one of the CUCMs is down? I don't see
it stated explicitly in the notes…

From: cisco-voip <cisco-voip-boun...@puck.nether.net> On Behalf Of NateCCIE
Sent: Wednesday, January 13, 2021 10:56
To: Pawlowski, Adam <aj...@buffalo.edu>
CC: cisco-voip@puck.nether.net
Subject: Re: [cisco-voip] MRA DR / Resilience

SIP Registration Failover for Cisco Jabber - MRA Deployments

https://www.cisco.com/c/dam/en/us/td/docs/voice_ip_comm/expressway/release_note/Cisco-Expressway-Release-Note-X12-7.pdf#page16

This is new in X12.7.
Sent from my iPhone

On Jan 13, 2021, at 6:10 AM, Pawlowski, Adam <aj...@buffalo.edu> wrote:

Hey all,

I’m playing in this scenario now and trying to figure out which parts of the
solution work, and which do not, in a DR “site failover” kind of scenario with
regard to MRA.

I understand the documentation states there’s no failover for voice and
video, but I think that failover is different from the one I’m describing here.

I know I can take Expressway C and Expressway E nodes out of the cluster at 
will, and things will heal over time once the Jabber clients catch up.

I can take a Unity Connection guest down, and it should work, though the Jetty 
service certainly has load limits. I don’t think I’m hitting those here.

I can take an IM&P node down, and, with the exception of pChat services (the DB
was not deployed HA and the merge job just seems to fail, but that’s another
investigation), clients will eventually fail over and recover.

Today, we have half the C cluster, half the E cluster, and one of two CUC
nodes down. All IMP are up. One UCM subscriber is down, and things have been
going poorly. Jabber customers keep getting punted from the client with “Your
session has expired” randomly. The Jabber log suggests the token has expired
but doesn’t provide enough debugging to know why. It’s possible that the
Expressway E is fronting this message, since I understand it sits between
Jabber and the rest of the infrastructure for OAuth, and Jabber does not talk
to the UCM/CUC directly.

When we did not have SSO, the worst thing we had to do was make sure that the
Jabber client’s device pool had an active UCM as the primary in the CMGroup, as
they wouldn’t register properly without that, but those UCMs are up.

Does anyone know what might be going on here?

My best guess is that the Expressway isn’t intelligent enough to mark a UCM
(or CUC server, for that matter) out of service when it is unreachable, and it
is trying to refresh a customer’s token against a server that isn’t up. When
this times out, instead of trying another server, it tells Jabber the refresh
token is expired. If this is the case, there’s no cluster resilience with
Jabber: if any nodes are down, things are going to be intermittent.
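
To make that guess concrete, here is a minimal Python sketch of the two
behaviors being contrasted. It is only an illustration of the hypothesis, not
how the Expressway actually works: the node names, the NodeUnreachable
exception, and the refresh callables are all hypothetical.

```python
from typing import Callable, Optional, Sequence


class NodeUnreachable(Exception):
    """Hypothetical error raised when a token refresh against a node times out."""


def refresh_without_failover(
    nodes: Sequence[str], refresh: Callable[[str], str]
) -> Optional[str]:
    """Suspected behavior: try one node; a timeout is reported as 'expired'."""
    try:
        return refresh(nodes[0])
    except NodeUnreachable:
        return None  # the client would see "Your session has expired"


def refresh_with_failover(
    nodes: Sequence[str], refresh: Callable[[str], str]
) -> Optional[str]:
    """Expected behavior: skip unreachable nodes and try the next one."""
    for node in nodes:
        try:
            return refresh(node)
        except NodeUnreachable:
            continue  # treat the node as out of service and move on
    return None  # every node failed, so the session really cannot be refreshed


def fake_refresh(node: str) -> str:
    """Stand-in for a real refresh call; 'ucm-sub1' plays the downed subscriber."""
    if node == "ucm-sub1":
        raise NodeUnreachable(node)
    return f"token-from-{node}"
```

With nodes ["ucm-sub1", "ucm-sub2"], the first function returns None even
though a healthy node exists, while the second returns a token from
"ucm-sub2". That matches the intermittent pattern described here: whether a
session survives would depend on which node happens to be tried.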

Why does Jabber sometimes choose to pop the dialog asking for a new session,
and sometimes just kick the customer out of the client, requiring a new sign
in? I see a bug that suggests enabling the LegacyOAuthSignout parameter, but it
doesn’t explain what effect that’s going to have on the client.

Basically, this is just a test, but I am trying to learn from it and would
appreciate any thoughts/experiences. If it is the Expressway cluster, then
there’s no way around this as far as I can tell. Marking a UCM inactive with
xAPI doesn’t work; it just gets pushed back to active.

Any comments appreciated.

Best,

Adam Pawlowski
SUNYAB NCS


_______________________________________________
cisco-voip mailing list
cisco-voip@puck.nether.net
https://puck.nether.net/mailman/listinfo/cisco-voip
