Take the interface template, for example. The default index is ifDescr, which comes from the standard IF-MIB. So that's one way to do it: base it on the binary/analog description field, which thankfully Forrest made editable. I set a custom description on every PoE port. Like you said, if you have two of the same module on a base unit, you get dupes unless you rename them. Once every description is unique, it doesn't matter if modules change order or differ from one base unit to another, since the lookup is based on a name.

You can also just go with the Idx field and if you change a module position, then just update the Cacti data source with the new Idx value. Fairly simple. Still requires a "Get SNMP Data (indexed)" method though.
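
Both approaches boil down to a lookup against the table a walk returns. Here is a rough Python sketch of the description-based one. It is only an illustration, not Cacti's own data query mechanism: the function names, the (index, description) rows, and the description strings are all made up, standing in for whatever a walk of the description column returns on a given unit.

```python
from collections import defaultdict


def index_by_description(walk_rows):
    """Map each description to the list of indexes that carry it."""
    by_name = defaultdict(list)
    for idx, descr in walk_rows:
        by_name[descr].append(idx)
    return dict(by_name)


def resolve(by_name, descr):
    """Return the single index for descr, or raise if missing or duplicated."""
    matches = by_name.get(descr, [])
    if len(matches) != 1:
        raise LookupError(f"{descr!r} matched {len(matches)} rows")
    return matches[0]


# Hypothetical walk results from two sites with different module layouts.
site_a = [(1, "Tower A PoE port 1"), (2, "Tower A PoE port 2"), (5, "Sync events")]
site_b = [(1, "Tower A PoE port 1"), (2, "Tower A PoE port 2"), (9, "Sync events")]

for rows in (site_a, site_b):
    print(resolve(index_by_description(rows), "Sync events"))
```

The same description resolves to a different index at each site, which is the point of indexing by name. A duplicated description raises instead of silently picking one row.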

I might start taking a look at it here this evening. I'm thinking I'll just go with the description field.

On 10/25/2014 3:00 PM, Bill Prince via Af wrote:
I suppose I could make several standard configurations. The configuration would include the mix of devices plus their position.

I'm comparing a couple of sites that are identical, except that one has a 4-port POE, and the other has an 8-port POE, both in slot 1. Both have slot 2 as a syncinjector.

Problem is, I've never paid particular attention to what the order was.

My bad.

I like making all the different devices, the SiteMonitor + position.

Only I'm not sure how to do that...

back to the drawing board.

bp
On 10/25/2014 12:28 PM, Forrest Christian (List Account) via Af wrote:

Most people end up with a set of three or four configurations. I.e., a SiteMonitor plus an injector is one configuration, a SiteMonitor by itself is another one.

If you put the modules you don't ever monitor at the end of the list then you can reuse configurations. I.e., a SiteMonitor and SyncInjector is the same as a SiteMonitor, SyncInjector, and PoE as far as monitoring goes.

On Oct 25, 2014 1:06 PM, "Bill Prince via Af" <af@afmug.com> wrote:

    OK.  I think I have an approach. The SiteMonitor plus all its
    expansion units is not the "device".

    The "device" is the SiteMonitor plus the index of the expansion unit.

    For example:

      * SiteMonitor, index 0 is the SiteMonitor device
      * SiteMonitor, index 1 is the 4-port POE device
      * SiteMonitor, index 2 is the SyncInjector (first instance)
      * SiteMonitor, index 3 is the SyncInjector (second instance)

    and so on.

    So when you add a SiteMonitor, you just add the SiteMonitor. If
    you add another Packetflux expansion unit, you have to add it
    knowing which index (AKA "slot") it is.  Put the device in a
    different position, and you need to update the index.
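
A rough sketch of that idea in Python, purely illustrative: the class name, field names, and the address are invented here, and nothing in it talks to Cacti or to a SiteMonitor. It only shows the "device" being keyed on the base unit plus the slot index, so that moving a module means updating one number.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class MonitoredDevice:
    host: str  # address of the SiteMonitor base unit (placeholder below)
    slot: int  # expansion index; 0 is the SiteMonitor itself
    kind: str  # free-form label for the module type


def move_to_slot(device, new_slot):
    """Return a copy of device that points at a different expansion index."""
    if new_slot < 0:
        raise ValueError("slot index cannot be negative")
    return replace(device, slot=new_slot)


# Hypothetical site layout, following the example list above.
devices = [
    MonitoredDevice("192.0.2.10", 0, "SiteMonitor"),
    MonitoredDevice("192.0.2.10", 1, "4-port POE"),
    MonitoredDevice("192.0.2.10", 2, "SyncInjector"),
]

# The SyncInjector was moved one position down the bus.
devices[2] = move_to_slot(devices[2], 3)
```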

    bp

    On 10/25/2014 10:52 AM, Bill Prince via Af wrote:
    Yah.  Except that the index moves around, depending on what's in
    front of it (e.g. 4-port POE versus an 8-port POE).  So I can't
    depend on what index number I'll be using at any given
    installation.  The index name will have to stay static if I ever
    hope to find it.  Then again, if I install two of anything,
    there will be more than one index with the same description.

    Hmmm.  How to do this.   Maybe I do have to give each device a
    unique description, and then teach cacti to index on the unique
    description?

    bp
    On 10/25/2014 10:16 AM, Forrest Christian (List Account) via Af
    wrote:

    They should be offset by a fixed amount. Ie subtract 4
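
One way to picture the fixed offset, as a hedged sketch: assume the part that moves is the last number of each OID, and that it shifts by the same amount for every data source behind the removed unit. Both of those are assumptions made for illustration, and the OIDs below are placeholders, not real PacketFlux OIDs.

```python
def shift_oid(oid, offset):
    """Return oid with its final sub-identifier shifted by offset."""
    prefix, _, last = oid.rpartition(".")
    new_last = int(last) + offset
    if new_last < 0:
        raise ValueError(f"offset {offset} pushes {oid} below zero")
    return f"{prefix}.{new_last}"


# Placeholder OIDs that were valid before the unit was removed.
old_oids = ["1.3.6.1.4.1.99999.1.9", "1.3.6.1.4.1.99999.1.10"]

# "Subtract 4" applied to every affected data source.
new_oids = [shift_oid(oid, -4) for oid in old_oids]
```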

    On Oct 25, 2014 10:58 AM, "Bill Prince via Af" <af@afmug.com> wrote:

        I think that may be it.  The OID I was using is no longer
        valid.  So the SNMP response that came back had numbers in
        it, but it also looks like the checksum was broken.

        Not clear to me why I thought I could do this without doing
        the index thing.

        I hate doing the index thing.

        bp

        On 10/24/2014 10:32 PM, Forrest Christian (List Account)
        via Af wrote:

        A power cycle and a reboot should be identical in almost
        every case.  The reboot actually triggers a hardware reset
        internally in the processor, which should clear everything
        out.  Of course as soon as I say that it is identical,
        someone will find an example where it is not.

        I'm not where I can look at the trace you sent, but I'm
        surprised it contains errors.  I do know that the unit
        will return a response which may look like this if the oid
        is invalid.

        Did you adjust your oids in cacti after the removal of the
        mystery expansion unit from the table?  If not, this is
        likely the problem.

        In regards to the unit being there from the factory: my
        guess is if you had this unit listed in there from the get
        go, then it probably was the expansion unit we use to test
        the expansion bus here.  It's supposed to be factory reset
        before shipping, but it would not shock me if it wasn't.
        We actually had a short period where a largish percentage
        went out not factory reset due to a tester software
        issue.  Not really a problem, but we hate to have them go
        out in any other state.

        On Oct 24, 2014 5:08 PM, "Bill Prince via Af" <af@afmug.com> wrote:

            You mean from the web GUI?  Sure.

            I presume a power cycle does something different from
            a reboot?

            I was always curious about this particular
            SiteMonitor, as it came up with the extra device on
            the expansion bus from the get-go.  I'd never
            worried about it, and then I saw the discussion about
            getting rid of old devices with the zeroed-serial trick.

            Don't go there!  It's a trap!

            bp

            On 10/24/2014 2:52 PM, George Skorup (Cyber
            Broadcasting) via Af wrote:
            Can you post a screenshot of your expansion, binary
            and analog tabs?

            Also, I bet if you power-cycle it, it will be fine
            again. I was working with Forrest on a bug where the
            SyncInjector and some other newer modules would
            mysteriously disappear from the bus. He was able to
            reproduce and get a fixed up firmware load for the
            modules. Something about one thing booting up faster
            than another, or something like that.

            On 10/24/2014 4:41 PM, Bill Prince via Af wrote:
            Gotcha!

            I removed all the Data Sources except one (PWR1).
            Suddenly that data was making it into cacti.

            Then I added back in all the Data Sources coming
            _JUST_ from the SiteMonitor itself.  That also worked.

            Then I added in one of the Data Sources from the
            SyncInjector (sync events), which happens to be the
            only unit on the expansion bus past where I removed
            the non-existent unit.  This broke it again.

            So I have apparently uncovered a bug where removing
            a unit from the expansion bus (by zeroing the serial
            number) causes the SiteMonitor to break SNMP
            responses.  I think it's probably just a bad
            checksum, but I will leave that up to Forrest.  I
            forwarded the pcap trace to him.

            I will probably also swap out the SiteMonitor that
            has the problem.

            Thanks guys!

            bp
            On 10/24/2014 1:57 PM, Bill Prince via Af wrote:
            Then again....

            Not sure why I didn't notice this the first (or
            second) time.  Wireshark is telling me I have a
            malformed packet; either a broken header or bad
            checksum.  So even though the SNMP response is
            coming in with the expected data, it's getting
            dropped before it gets into cacti because of the
            malformed packet.

            This would explain why removing a unit on the
            expansion bus changed things...
            bp



            On 10/24/2014 1:32 PM, Bill Prince via Af wrote:
            OK. Confirmed.  The SiteMonitor is getting the
            SNMP requests, and it is responding with the
            expected values.

            I ran a pcap trace both at the SiteMonitor as well
            as at the ethernet port on the cacti server.
            SNMP requests/responses are going both ways (and
            at both ends). In fact, spine appears to be doing
            3 retries.

            One thing I didn't expect is that just before the
            SNMP requests, there are two attempts to open a
            telnet on the SiteMonitor.  Not sure where that
            is coming from, except perhaps for the Manage
            plugin (which I de-installed several weeks ago).

            So something is broken inside cacti.  How/why
            this was caused by zeroing a serial number from a
            non-existent expansion unit is completely baffling
            to me.

            I also have no clue how to fix it, because cacti
            "thinks" there was no response.
            bp
            On 10/24/2014 11:16 AM, George Skorup (Cyber
            Broadcasting) via Af wrote:
            I am thoroughly confused. Is your community
            string correct? Can you increase the device SNMP
            timeout, like 1000ms instead of 250ms. What's
            your device down detection set to? Is it showing
            down in the device list?

            I have seen some base units go kinda screwy and
            respond slower and a reboot doesn't fix it, they
            needed a power-cycle.

            On 10/24/2014 11:25 AM, Bill Prince via Af wrote:
            Now thrice.

            No joy in Mudville.

            bp
            On 10/24/2014 8:07 AM, Bill Prince via Af wrote:
            Yah.  Twice now.

            bp
            On 10/23/2014 11:06 PM, George Skorup (Cyber
            Broadcasting) via Af wrote:
            Gotta be the poller cache. Did you try a rebuild?

            On 10/23/2014 11:03 PM, Bill Prince via Af wrote:
            Getting closer.  When I look in the SNMP
            cache, there is no entry for the device.

            Looking in the log (without debug), I get:

            10/23/2014 08:34:25 PM - SPINE: Poller[0]
            Host[797
            <http://10.13.112.20/host.php?action=edit&id=797>]
            TH[1] DS[12316
            <http://10.13.112.20/data_sources.php?action=ds_edit&id=12316>]
            WARNING: SNMP timeout detected [250 ms],
            ignoring host '10.13.114.254'

            So there is something causing the SNMP
            request to barf inside cacti.  When I do an
            snmpget from the CLI, it all looks fine.
            Likewise, the realtime plugin is working fine
            too.

            So when realtime is doing the SNMP queries
            outside the poller, they are fine.  It only fails
            when spine is doing the SNMP requests.


            bp
            On 10/23/2014 4:12 PM, George Skorup (Cyber
            Broadcasting) via Af wrote:
            You divided by zero, didn't you?

            Are you sure your modules are in the same
            order as before?

            On 10/23/2014 1:29 PM, Bill Prince via Af
            wrote:

            I noticed an "Expansion Unit" on one of my
            SiteMonitors this morning.  It said
            something about "Device Removed" or
            something like that.

            Remembering the discussion the other day on
            this topic, I put a "0" in the Serial # for
            the non-existent unit, rescanned, & rebooted.

            Now, none of the OIDs work in Cacti.  If
            I do a simple snmpget on any of the OIDs
            that I use, the correct information comes
            back. Several of the OIDs are on the base
            unit anyway, so they would not have moved,
            and further, the OIDs don't reference the
            serial number.

            So... what did I do, and how do I fix it?

















