Good news: all of our access points are up tonight.  Bad news: it was a
rough couple of days before we got to this point, with a lot of Zoom calls
with Aruba engineers.

Early on in the debug process, I noticed logs from the access points
indicating lost packets, duplicate packets, and packets out of sequence.
Tunnels between the APs and the controllers weren't stable.  APs were
taking hours to boot, or never coming up at all.  APs that were up and
passing traffic would drop from the network.
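
For anyone unfamiliar with those three log messages, here is a minimal
Python sketch of how a receiver could classify packets by sequence number.
This is an illustration only, not how ArubaOS actually does it; the
function name classify_sequence and the sample sequence are made up for
the example.

```python
def classify_sequence(seqs):
    """Count duplicate, out-of-sequence, and lost packets.

    seqs is a list of integer sequence numbers in arrival order.
    Hypothetical helper for illustration; not an Aruba API.
    """
    seen = set()
    highest = None
    counts = {"duplicate": 0, "out_of_sequence": 0, "lost": 0}
    for seq in seqs:
        if seq in seen:
            # Same sequence number arrived again.
            counts["duplicate"] += 1
            continue
        if highest is not None and seq < highest:
            # New packet, but it arrived after a later one.
            counts["out_of_sequence"] += 1
        seen.add(seq)
        if highest is None or seq > highest:
            highest = seq
    if seen:
        # Anything missing between the lowest and highest number seen
        # never arrived at all.
        counts["lost"] = (highest - min(seen) + 1) - len(seen)
    return counts


if __name__ == "__main__":
    # 2 arrives twice, 3 arrives after 4, and 5 never arrives.
    print(classify_sequence([1, 2, 2, 4, 3, 6]))
```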

The first engineer at Aruba said it looked like a network issue, and that
we should look into the network switches between the APs and the
controllers.  The engineer turned up logging and rebooted controllers and
access points, but nothing seemed to keep the APs online for long.  I had
performed an upgrade from 8.5.0.3 to 8.7.1.1, so we downgraded the
controllers back to 8.5.0.3, but this didn't resolve the issue.

Three more days of tech support.  By now, we have upgraded to 8.7.0.0,
changed logging some more, and collected lots of log files.  I install
packet capture devices on both switches, and we can clearly see that all
packets from the APs are arriving properly at the controller, which is
discarding them.  I think we can finally stop blaming the network.  At this
point, we are beyond level one tech support, and yesterday we even had
developers on the Zoom call with us.  Then one engineer asks who turned all
this logging on, and turns it all off.  Within five minutes, all access
points are back online.

We reboot all of the APs, and within five minutes, they are all back.  We
watch for a few hours, and they all stay up.  I breathe a bit easier.

It appears that in the process of trying to figure out the issues we were
seeing, we kept turning up the logging level, which increased the amount of
CPU the controller had to spend on logging, to the detriment of processing
packets to and from the access points.

We still see a few issues with communication between the APs and the
controllers, but now at least the APs remain up on the redundant
tunnels.  Aruba is still working on resolving a load issue on the
controller where it's dropping packets.

Robert Spellman
*Associate Director for Network Services*
Information and Library Services
Bates College
p: 207-786-6422
a: 110 Russell Street, Lewiston, ME 04240
w: www.bates.edu  e: rspell...@bates.edu <rspellm...@bates.edu>


On Tue, Dec 29, 2020 at 8:37 AM Robert Spellman <rsp...@bates.edu> wrote:

> Our latest purchase of Aruba access points included some that required
> 8.7, so we planned on upgrading from 8.5.0.3 to 8.7.1.0 over Christmas
> break.  We have three 7220 controllers and a virtual mobility master
> running, with around 1200 access points.
>
> Thursday morning, we did the upgrade on the master and the three
> controllers.  After a reboot, around 50% of the access points failed to
> come back online.  I figured they would take some time, as they would need
> to download the new code and reboot, and maybe the controllers were a
> little busy.
>
> By afternoon, we were still around 50%.  I opened a ticket with Aruba, and
> they collected some logs, rebooted a bunch of stuff, but for the most part,
> didn't make any changes.  After rebooting everything, I had around 90% of
> the access points up.
>
> Christmas day, I found access points dropping off of the list.  They were
> still pingable, but the tunnels between the APs and the controllers
> wouldn't fully come up.  Output from show datapath sessions would show a
> connection, but not enough to bring the access points back.
>
> Saturday, I was back on with Aruba.  After another series of log
> collection and reboots, we were back to around 90%.  I still didn't feel
> good about this, as we hadn't really made any changes.  By late evening, I
> had 9 access points online.
>
> Sunday, we moved the Aruba controllers to new switches.  They had
> previously been connected to a stack of switches, all on the same layer 2
> network.  After the reboot, we again had most of the access points online,
> but as the evening wore on, the number of access points online dropped.
>
> Monday, call Aruba, wash, rinse and repeat.  At one point Monday
> afternoon, we had all access points back online.  I don't think the
> engineer made any configuration changes, just rebooted the controllers.
>
> Right now, 622 are down, 607 are up.
>
> We did revert to 8.5.0.3, but that didn't seem to help.  Right now, we are
> at 8.7.0.0.
>
>
>
> Robert Spellman
> Bates College
> Information and Library Services
>
