Since you worked in a brokerage environment, you probably understand better than most... It's straight IP, but there is equipment from Bloomberg, Metavante, Reuters, Pershing, etc. For most of the equipment, I have to take the vendor's word that it isn't misbehaving, since I don't have access. Supposedly, each vendor checked their equipment and ruled out all the things we asked them to check.
At 07:30 AM 12/22/2002 +0000, you wrote:

>Craig, I looked through the other responses, but I wanted to offer
>something. I worked in brokerage for a number of years, most of which
>time I was at a beginner's level regarding networking. But I do recall
>some "strange" things happening, and I never did trust the answers
>particular vendors were giving me.
>
>First question - when you say "vendor" Unix boxes, are you talking ILX
>systems? IP-only boxes? No port to IPX, I assume.
>
>Second question - is there a firewall someplace in the mix?
>
>Third question - any other vendor equipment - say a Bloomberg router or
>a Bridge Networks server, or maybe a Telerate or two?
>
>Any other Thomson equipment in the mix?
>
>I had a problem once with what ILX told me was a routing loop. I'd have
>to sit back and think a long time about the topology I had in place. The
>problem only occurred with a particular branch that I was moving from a
>bridged to a routed WAN link.
>
>Another time, when I was testing centralized ILX services (servers at
>HQ, workstations in remote offices), ILX used to blame the failure to
>operate properly on the IP helpering that I had in place for DHCP
>purposes. They also used to claim that the RIP passive on my PIX
>firewall was interfering with their servers. I can buy the routing loop,
>but I never did buy their IP helpering and PIX finger-pointing. Again,
>I'd have to sit back and think a while. It's been over three years now.
>
>I asked about other vendors because you never can tell when a
>misconfigured redistribution or some static route from third-party
>equipment might creep into the mix.
>
>Let us all know. Especially me. I still have a soft spot in my heart for
>brokerage.
>
>Chuck
>
>--
>TANSTAAFL
>"there ain't no such thing as a free lunch"
>
>
>"Craig Columbus" wrote in message news:[EMAIL PROTECTED]...
> > I worked on a network move for a brokerage company last week and
> > encountered a VERY strange problem.
> >
> > We moved a bunch of equipment to a new office building. During the
> > process, we changed the internal network from 192.168.100.0/24 to
> > 172.31.4.0/22.
> > The company has 4 Cisco 3500XL 48-port switches, with no VLANs and
> > plain vanilla configurations. The fanciest thing is portfast on the
> > client machine ports.
> > Switches are linked via GBICs in a cascade. There is one
> > client-maintained router that sits before the firewall with only
> > static routes and no routing protocols.
> > There are multiple outside vendor routers for specific applications
> > (real-time quotes, clearinghouse mainframe, etc.), but these too have
> > only static routes and no routing protocols.
> >
> > After installing all of the network equipment and servers, we started
> > to turn on clients and get new DHCP addresses. Since the new network
> > was 172.31.4.0/22, 172.31.4.1 - 172.31.4.255 was reserved for servers,
> > printers, switches, and routers. The remaining 172.31.5.0 -
> > 172.31.7.254 was reserved for clients...though there are only about
> > 100 clients at the moment, and thus they only took 5.0 - 5.100 or so
> > in DHCP.
> >
> > After installing maybe 20 clients or so, we started to see mass
> > slowdowns on the network. Pings between clients and servers were very
> > irregular and intermittent. There was no discernible pattern to when
> > pings would succeed and when they'd fail.
> > We exhaustively went through all devices and made sure that they'd
> > been correctly set to the new mask and that all server functions
> > (DNS, WINS, AD, etc.) had been correctly set up for the new subnet.
> > Everything looked fine. In an effort to troubleshoot, we unhooked the
> > switch stack and put core servers and a few clients on a single
> > switch. Again, communication was irregular and unpredictable, whether
> > with static or DHCP addresses on the clients. Sometimes things would
> > be fine; other times clients could ping the server, but not the
> > switch to which they were attached. Sometimes clients could ping the
> > switch, but not the server. Sometimes the clients could ping neither.
> > Again, there seemed to be no pattern. Thinking there might have been
> > some IOS bug, we erased nvram, upgraded the switches to current IOS
> > code, and put in a completely plain configuration. This had no effect
> > on the problem.
> >
> > After 4 of us (with probably 50 years of industry experience between
> > us) spent 15 hours or so trying to resolve the issue, I finally
> > suggested we try moving the clients from the 172.31.5.x/22 block to
> > the 172.31.4.x/22 block. This solved all problems, and all clients
> > were able to ping both switches and servers 100% of the time. Again,
> > we didn't change the mask on anything, only the third octet of the
> > client IP range. We then went back and triple-checked every device
> > attached to the network...servers, routers, switches, printers,
> > clients, etc. Every single device had the correct mask (/22) except
> > for two vendor-maintained UNIX boxes...they had 172.31.4.x/24. We
> > suspected as much earlier, since clients couldn't communicate with
> > the UNIX boxes from the beginning, but the other servers could
> > communicate with the UNIX boxes without issue. These UNIX servers
> > weren't running RIP (or any other routing protocol)...and besides,
> > there aren't any other network devices listening for RIP...so we
> > weren't really concerned about them causing the network connectivity
> > issues. At the time, I couldn't see how a bad mask on these boxes
> > could effectively make the whole network unusable, so I didn't bother
> > correcting it early in the day.
> >
> > At this point, I've had a week to think about the issue and I still
> > don't have a logical reason for why this problem might have occurred.
> > Anyone out there have any thoughts?
> > I'm going back to put in a 3550EMI as the core in a couple of weeks.
> > At that point, we're going to investigate more and try to move the
> > clients back to the 172.31.5.x range. I'd like to test theories at
> > that time if anyone can put one forward that we didn't already
> > test...as I said, we spent a lot of time on this and I didn't put
> > every test we did in this email. All I can offer is that it wasn't
> > IOS code (we tried more than one version), it wasn't the switches (we
> > tried several, including non-Cisco), and it wasn't DNS, WINS, DHCP,
> > or any other server-side issue (we thoroughly examined and ruled
> > those out...besides, this was even happening at the IP level between
> > switches). Everything had worked correctly at the old building...the
> > only two things that changed significantly during the move were the
> > IP range and the building wiring. AND, the wiring in the new building
> > was brand new Cat6...I even dug out the WireScope and verified that
> > the drops passed spec.
> >
> > Thanks!
> > Craig
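
For anyone checking the addressing plan described in the post above, here
is a minimal sketch of the subnet math using Python's standard ipaddress
module. The network and ranges are the ones from Craig's post; the two
specific host addresses are illustrative, not from the actual network:

    from ipaddress import ip_network, ip_address

    net = ip_network("172.31.4.0/22")
    print(net.num_addresses)       # 1024 addresses: 172.31.4.0 - 172.31.7.255
    print(net.broadcast_address)   # 172.31.7.255

    # Both the server reservation (172.31.4.1 - 172.31.4.255) and the
    # client range (172.31.5.0 - 172.31.7.254) fall inside the same /22:
    print(ip_address("172.31.4.10") in net)   # True  (server block)
    print(ip_address("172.31.5.50") in net)   # True  (client block)

So on a correctly configured /22, everything from 172.31.4.0 through
172.31.7.255 is one subnet, and the 4.x/5.x split is purely an
administrative convention.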
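And a sketch of why the /24 mask on the two vendor-maintained UNIX boxes
lines up with the symptom that servers could reach them but clients could
not (again Python's ipaddress module; the host addresses are hypothetical
examples):

    from ipaddress import ip_network, ip_address

    right = ip_network("172.31.4.0/22")   # what everything else used
    wrong = ip_network("172.31.4.0/24")   # what the UNIX boxes used

    server = ip_address("172.31.4.10")    # hypothetical server in 4.x
    client = ip_address("172.31.5.50")    # hypothetical client in 5.x

    print(server in right, server in wrong)   # True True
    print(client in right, client in wrong)   # True False

    # To the /24 box, anything in 172.31.5.0 - 172.31.7.255 looks
    # off-subnet, so instead of ARPing for a 5.x client directly it
    # sends replies toward its default gateway, while 4.x servers
    # still look on-link and work normally.

That accounts for the client-to-UNIX-box failures Craig describes; whether
it can also explain the network-wide slowdown would depend on what the
gateway (and any proxy ARP) did with that misdirected traffic, which the
thread leaves open.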