They should be offset by a fixed amount, i.e., subtract 4.

On Oct 25, 2014 10:58 AM, "Bill Prince via Af" <af@afmug.com> wrote:
> I think that may be it. The OID I was using is no longer valid. So the
> SNMP response that came back had numbers in it, but it also looks like the
> checksum was broken.
>
> Not clear to me why I thought I could do this without doing the index
> thing.
>
> I hate doing the index thing.
>
> bp
>
> On 10/24/2014 10:32 PM, Forrest Christian (List Account) via Af wrote:
>
> A power cycle and a reboot should be identical in almost every case. The
> reboot actually triggers a hardware reset internally in the processor,
> which should clear everything out. Of course as soon as I say that it is
> identical, someone will find an example where it is not.
>
> I'm not where I can look at the trace you sent, but I'm surprised it
> contains errors. I do know that the unit will return a response which may
> look like this if the oid is invalid.
>
> Did you adjust your oids in cacti after the removal of the mystery
> expansion unit from the table? If not, this is likely the problem.
>
> In regards to the unit being there from the factory: my guess is if you
> had this unit listed in there from the get go, then it probably was the
> expansion unit we use to test the expansion bus here. It's supposed to be
> factory reset before shipping but it would not shock me if it wasn't. We
> actually had a short period that a largish percentage went out not factory
> reset due to a tester software issue. Not really a problem but we hate to
> have them go out in any other state.
>
> On Oct 24, 2014 5:08 PM, "Bill Prince via Af" <af@afmug.com> wrote:
>
>> You mean from the web GUI? Sure.
>>
>> I presume a power cycle does something different from a reboot?
>>
>> I was always curious about this particular SiteMonitor, as it came up
>> with the extra device on the expansion bus from the get-go. I'd never
>> worried about it, and then I saw the discussion about getting rid of old
>> devices with the zeroed-serial trick.
>>
>> Don't go there! It's a trap!
>>
>> bp
>>
>> On 10/24/2014 2:52 PM, George Skorup (Cyber Broadcasting) via Af wrote:
>>
>> Can you post a screenshot of your expansion, binary and analog tabs?
>>
>> Also, I bet if you power-cycle it, it will be fine again. I was working
>> with Forrest on a bug where the SyncInjector and some other newer modules
>> would mysteriously disappear from the bus. He was able to reproduce and get
>> a fixed up firmware load for the modules. Something about one thing booting
>> up faster than another, or something like that.
>>
>> On 10/24/2014 4:41 PM, Bill Prince via Af wrote:
>>
>> Gotcha!
>>
>> I removed all the Data Sources except one (PWR1). Suddenly that data
>> was making it into cacti.
>>
>> Then I added back in all the Data Sources coming _JUST_ from the
>> SiteMonitor itself. That also worked.
>>
>> Then I added in one of the Data Sources from the SyncInjector (sync
>> events), which happens to be the only unit on the expansion bus past where
>> I removed the non-existent unit. This broke it again.
>>
>> So I have apparently uncovered a bug where removing a unit from the
>> expansion bus (by zeroing the serial number) causes the SiteMonitor to
>> break SNMP responses. I think it's probably just a bad checksum, but I
>> will leave that up to him. I forwarded the pcap trace to him.
>>
>> I will probably also swap out the SiteMonitor that has the problem.
>>
>> Thanks guys!
>>
>> bp
>>
>> On 10/24/2014 1:57 PM, Bill Prince via Af wrote:
>>
>> Then again....
>>
>> Not sure why I didn't notice this the first (or second) time.
>> Wireshark is telling me I have a malformed packet; either a broken header
>> or bad checksum. So even though the SNMP response is coming in with the
>> expected data, it's getting dropped before it gets into cacti because of
>> the malformed packet.
>>
>> This would explain why removing a unit on the expansion bus changed
>> things...
>>
>> bp
>>
>> On 10/24/2014 1:32 PM, Bill Prince via Af wrote:
>>
>> OK.
>> Confirmed. The SiteMonitor is getting the SNMP requests, and it is
>> responding with the expected values.
>>
>> I ran a pcap trace both at the SiteMonitor as well as at the ethernet
>> port on the cacti server. SNMP requests/responses are going both ways
>> (and at both ends). In fact, spine appears to be doing 3 retries.
>>
>> One thing I didn't expect is that just before the SNMP requests, there
>> are two attempts to open a telnet on the SiteMonitor. Not sure where
>> that is coming from, except perhaps for the Manage plugin (which I
>> de-installed several weeks ago).
>>
>> So something is broken inside cacti. How/why this was caused by
>> zeroing a serial number from a non-existent expansion unit is completely
>> baffling to me.
>>
>> I also have no clue how to fix it, because cacti "thinks" there was no
>> response.
>>
>> bp
>>
>> On 10/24/2014 11:16 AM, George Skorup (Cyber Broadcasting) via Af wrote:
>>
>> I am thoroughly confused. Is your community string correct? Can you
>> increase the device SNMP timeout, like 1000ms instead of 250ms. What's your
>> device down detection set to? Is it showing down in the device list?
>>
>> I have seen some base units go kinda screwy and respond slower and a
>> reboot doesn't fix it, they needed a power-cycle.
>>
>> On 10/24/2014 11:25 AM, Bill Prince via Af wrote:
>>
>> Now thrice.
>>
>> No joy in Mudville.
>>
>> bp
>>
>> On 10/24/2014 8:07 AM, Bill Prince via Af wrote:
>>
>> Yah. Twice now.
>>
>> bp
>>
>> On 10/23/2014 11:06 PM, George Skorup (Cyber Broadcasting) via Af wrote:
>>
>> Gotta be the poller cache. Did you try a rebuild?
>>
>> On 10/23/2014 11:03 PM, Bill Prince via Af wrote:
>>
>> Getting closer. When I look in the SNMP cache, there is no entry for
>> the device.
>>
>> Looking in the log (without debug), I get:
>>
>> 10/23/2014 08:34:25 PM - SPINE: Poller[0] Host[797
>> <http://10.13.112.20/host.php?action=edit&id=797>] TH[1] DS[12316
>> <http://10.13.112.20/data_sources.php?action=ds_edit&id=12316>] WARNING:
>> SNMP timeout detected [250 ms], ignoring host '10.13.114.254'
>>
>> So there is something causing the SNMP request to barf inside cacti.
>> When I do an snmpget from the CLI, it all looks fine. Likewise, the
>> realtime plugin is working fine too.
>>
>> So when realtime is doing the SNMP queries outside the poller, they are
>> fine. Just when spine is doing the SNMP requests.
>>
>> bp
>>
>> On 10/23/2014 4:12 PM, George Skorup (Cyber Broadcasting) via Af wrote:
>>
>> You divided by zero, didn't you?
>>
>> Are you sure your modules are in the same order as before?
>>
>> On 10/23/2014 1:29 PM, Bill Prince via Af wrote:
>>
>> I noticed an "Expansion Unit" on one of my SiteMonitors this morning.
>> It said something about "Device Removed" or something like that.
>>
>> Remembering the discussion the other day on this topic, I put a "0" in
>> the Serial # for the non-existent unit, rescanned, & rebooted.
>>
>> Now, none of the OIDs work in Cacti. If I do a simple snmpget on any
>> of the OIDs that I use, the correct information comes back. Several of the
>> OIDs are on the base unit anyway, so they would not have moved, and
>> further, the OIDs don't reference the serial number.
>>
>> So... what did I do, and how do I fix it?
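
The reply at the top of the thread says the OIDs for units past the removed expansion slot should be offset by a fixed amount (subtract 4). Here is a minimal Python sketch of that renumbering, in case it helps when updating the Data Sources. It assumes the index that shifts is the last sub-identifier of the OID, which the thread does not confirm. The helper name shift_oid_index and the example OID (with the placeholder enterprise number 99999) are hypothetical and not taken from the SiteMonitor MIB.

```python
def shift_oid_index(oid: str, offset: int, position: int = -1) -> str:
    """Return `oid` with one sub-identifier reduced by `offset`.

    ASSUMPTION: the table index is the sub-identifier at `position`
    (the last one by default). Check the device's MIB before relying on it.
    """
    leading_dot = oid.startswith(".")
    parts = oid.strip(".").split(".")
    new_value = int(parts[position]) - offset
    if new_value < 0:
        raise ValueError(f"offset {offset} would make the index negative in {oid}")
    parts[position] = str(new_value)
    return ("." if leading_dot else "") + ".".join(parts)


if __name__ == "__main__":
    # Hypothetical OID; 99999 is a placeholder enterprise number.
    old_oid = ".1.3.6.1.4.1.99999.1.2.9"
    print(shift_oid_index(old_oid, 4))  # prints .1.3.6.1.4.1.99999.1.2.5
```

Only the OIDs for units that sat after the removed slot would need this. Per the original post, OIDs on the base unit should not have moved.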