Hi All,

Just wondering if anyone has had any similar experiences to the fun we've had 
over the last week or so.

Towards the end of last year we moved to a new 8510 HA pair on 8.2.121.11 (we 
had an issue in testing at the time, so we grabbed the latest ER release that 
resolved a crash bug), from 5x 5508s in N+1 on 8.0.121.0 code.
We started before the end of term with a small number of locations but didn't 
fully load it up until the big break. Now that the students are back and 
needing their internet, we have had some real load issues during the day.

So it's 2x 8510s in HA with about 2100 APs, peaking at about 14k concurrent 
clients, but the issue seems to creep in at about 10k. While ICMP isn't the 
greatest tool for measuring performance, it does line up here: the graph below 
shows that from around 10am we see increased delays in responses from the 
vlan42 (client network) interface on the controller, and we see this on its 
management interface too. At this point ICMP from our clients to their own 
gateway climbs from 1-3ms to 400-600ms, and even up to 1800ms when the big 
spike shows 800ms to the interface. Iperf testing also goes from 100Mb/s down 
to 1-5Mb/s and even 0 at times. Users are complaining of slowness and, at its 
worst, being unable to log in.
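
For anyone wanting to line up ping results against client counts in the same 
way, here is a rough sketch only. It assumes Linux-style ping reply lines 
(with a "time=... ms" field); the function names, the 100ms threshold and the 
sample values in the usage note are made up for illustration and are not our 
measurements or tooling.

```python
import re
from typing import Iterable, Optional, Tuple

# Assumption: reply lines carry a "time=2.31 ms" (or "time<1ms") field,
# as printed by common ping implementations.
_RTT_RE = re.compile(r"time[=<]([\d.]+)\s*ms")


def parse_rtt_ms(line: str) -> Optional[float]:
    """Return the round-trip time in ms from one ping reply line, or None."""
    match = _RTT_RE.search(line)
    return float(match.group(1)) if match else None


def load_at_onset(
    samples: Iterable[Tuple[int, float]], threshold_ms: float
) -> Optional[int]:
    """Return the client count of the first sample whose RTT exceeds the threshold.

    samples is a time-ordered sequence of (concurrent_clients, rtt_ms) pairs.
    """
    for clients, rtt_ms in samples:
        if rtt_ms > threshold_ms:
            return clients
    return None
```

Feeding it hypothetical pairs such as (8000, 2.0), (10000, 450.0) and 
(14000, 800.0) with a 100ms threshold returns the client count at which the 
delay first crosses that threshold, which is a quick way to check whether the 
onset really tracks load rather than time of day.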

CPU/memory resources, channel utilisation etc. are all OK. It's a site-wide 
impact on users no matter whether it's an HD RF design or what AP model (1142, 
2702, 3702, 3502 etc.), so it seems to be in the controller itself. All testing 
was done on 5GHz.

Around midday we started migrating APs away to our old 5508s, which saw a 
significant drop in delay just before 12:30, and things were back to normal at 
12:40 once 300 APs were moved off. So for now users are happy; apparently 
we've even had callers saying how good it is today (it must have been bad the 
last week for that to happen). Controller response to SNMP was so bad it was 
taking Prime 2 minutes per AP to re-configure the primary controller, so we 
did it by hand. SSH/GUI response was not its normal self but was no problem. 
The 5508s have shown no signs of being unhappy with about 150 APs each.

We are working with TAC, who have been good, and they are investigating (no 
similar cases found though). Shedding the load has worked around the issue, but 
it needs fixing. We upgraded to 8.2.141.0 yesterday evening but won't be 
re-loading the 8510s until next week, so confirming it's fixed is a few days 
off. There are a few short delayed ICMP responses of up to 30ms today, but it's 
hard to know if that's related or just the nature of ICMP and how network gear 
prioritises it.

Interested to know if anyone has seen anything like this in their environment.
And if anyone out there is using 8510s in HA, what's your load in APs and 
concurrent users? I can imagine many places loading their devices up more than 
us.
Does anyone know how to look at other hardware resources (not CPU/memory/system 
buffers)? Something like the ASIC on switches, if it exists. Surely all this 
traffic isn't handled by the CPU.

Thanks

Jason

[Graph: ICMP response delays to the controller interfaces; image not included]
--
Jason Cook
Technology Services
The University of Adelaide, AUSTRALIA 5005
Ph    : +61 8 8313 4800
e-mail: jason.c...@adelaide.edu.au

CRICOS Provider Number 00123M

**********
Participation and subscription information for this EDUCAUSE Constituent Group 
discussion list can be found at http://www.educause.edu/discuss.
