I worked on a network move for a brokerage company last week and encountered a VERY strange problem.
We moved a bunch of equipment to a new office building. During the process, we changed the internal network from 192.168.100.0/24 to 172.31.4.0/22. The company has 4 Cisco 3500XL 48-port switches, with no VLANs and plain vanilla configurations. The fanciest thing is portfast on the client machine ports. Switches are linked via GBICs in a cascade. There is one client-maintained router that sits before the firewall with only static routes and no routing protocols. There are multiple outside vendor routers for specific applications (real-time quotes, clearinghouse mainframe, etc.), but these too have only static routes and no routing protocols.

After installing all of the network equipment and servers, we started to turn on clients and get new DHCP addresses. Since the new network was 172.31.4.0/22, 172.31.4.1 - 172.31.4.255 was reserved for servers, printers, switches, and routers. The remaining 172.31.5.0 - 172.31.7.254 was reserved for clients...though there are only about 100 clients at the moment and thus they only took 172.31.5.0 - 172.31.5.100 or so in DHCP.

After installing maybe 20 clients or so, we started to see mass slowdowns on the network. Pings between clients and servers were very irregular and intermittent. There was no discernible pattern to when pings would succeed and when they'd fail. We exhaustively went through all devices and made sure that they'd been correctly set to the new mask and that all server functions (DNS, WINS, AD, etc.) had been correctly set up for the new subnet. Everything looked fine.

In an effort to troubleshoot, we unhooked the switch stack and put core servers and a few clients on a single switch. Again, communication was irregular and unpredictable, whether with static or DHCP addresses on the clients. Sometimes things would be fine, other times clients could ping the server, but not the switch to which they were attached. Sometimes clients could ping the switch, but not the server. Sometimes the clients could ping neither.
Again, there seemed to be no pattern. Thinking there might have been some IOS bug, we erased NVRAM, upgraded the switches to current IOS code, and put in a completely plain configuration. This had no effect on the problem.

After 4 of us (with probably 50 years of industry experience between us) spent 15 hours or so trying to resolve the issue, I finally suggested we try moving the clients from the 172.31.5.x/22 block to the 172.31.4.x/22 block. This solved all problems, and all clients were able to ping both switches and servers 100% of the time. Again, we didn't change the mask on anything, only the third octet of the client IP range.

We then went back and triple-checked every device attached to the network...servers, routers, switches, printers, clients, etc. Every single device had the correct mask (/22) except for two vendor-maintained UNIX boxes...they had 172.31.4.x/24. We suspected as much earlier since clients couldn't communicate with the UNIX boxes from the beginning, but the other servers could communicate with the UNIX boxes without issue. These UNIX servers weren't running RIP (or any other routing protocol)...and besides, there aren't any other network devices listening for RIP...so we weren't really concerned about them causing the network connectivity issues. At the time, I couldn't see how a bad mask on these boxes could effectively make the whole network unusable, so I didn't bother correcting it early in the day.

At this point, I've had a week to think about the issue and I still don't have a logical reason for why this problem might have occurred. Anyone out there have any thoughts? I'm going back to put in a 3550EMI as the core in a couple of weeks. At that point, we're going to investigate more and try to move the clients back to the 172.31.5.x range. I'd like to test theories at that time if anyone can put one forward that we didn't already test...as I said, we spent a lot of time on this and I didn't put every test we did in this email.
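For anyone who wants to check the mask arithmetic, here is a minimal Python sketch using the standard `ipaddress` module. The specific host addresses (`172.31.4.50`, `172.31.4.10`, `172.31.5.20`) and the helper `on_link` are made up for illustration; only the /22 and /24 prefixes come from the setup described above. It shows what the mismatched mask changes from the UNIX boxes' point of view. It is not offered as an explanation of the network-wide slowdown.

```python
import ipaddress

# The real network from the post.
network = ipaddress.ip_network("172.31.4.0/22")

# Hypothetical addresses picked from the ranges in the post;
# the actual host addresses are not given.
unix_box = ipaddress.ip_interface("172.31.4.50/24")  # vendor box with the wrong mask
server = ipaddress.ip_address("172.31.4.10")         # a server in the 4.x block
client = ipaddress.ip_address("172.31.5.20")         # a DHCP client in the 5.x block


def on_link(iface, peer):
    """Return True if `iface` would consider `peer` to be on its own subnet."""
    return peer in iface.network


# The /22 covers 1024 addresses, from 172.31.4.0 through 172.31.7.255.
print(network.num_addresses, network[0], network.broadcast_address)

# A /24 host sees the 4.x servers as local but the 5.x clients as remote,
# and it uses a different broadcast address than everything else on the wire.
print(on_link(unix_box, server))            # local from the box's point of view
print(on_link(unix_box, client))            # not local from the box's point of view
print(unix_box.network.broadcast_address)   # the /24 box's idea of broadcast
```

Under these assumptions, a /24 box would send replies for 172.31.5.x clients toward its default gateway instead of directly onto the wire, which is at least consistent with the clients failing to reach the UNIX boxes while the 4.x servers could.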
All I can offer is that it wasn't IOS code (we tried more than one version), it wasn't the switches (we tried several, including non-Cisco), and it wasn't DNS, WINS, DHCP, or any other server-side issue (we thoroughly examined and ruled those out...besides, this was even happening at the IP level between switches). Everything had worked correctly at the old building...the only two things that changed significantly during the move were the IP range and the building wiring. And the wiring in the new building was brand new Cat6...I even dug out the WireScope and verified that the drops passed spec.

Thanks!
Craig

Message Posted at: http://www.groupstudy.com/form/read.php?f=7&i=59682&t=59682

