On Apr 4, 2012, at 11:03 AM, Makewise - Vitor Rui Mendonça wrote:

> I think there's a problem regarding the time between PSE generation and 
> poller activation: if OpenNMS shuts down in that time frame (for example), 
> then the outage is never generated/resolved.

The core problem is the OpenNMS shutdown.  Sure, you can engineer yet another 
workaround in the code for this HA issue, but what is the cost?  This isn't the 
only place where OpenNMS, or any NMS, is vulnerable to single-instance 
shutdown/failure.  So where are you going to put your effort, everywhere in 
the code?  That would be a huge effort and would probably slow things down in 
places we can't even imagine yet.  In this example, what if the shutdown 
happens before the Trap is received?  That is just as critical as a shutdown 
after the Trap is PSE'd but before the Poll… but at least in the latter case 
you have the trap and the PSE event.  The time you are referring to, between 
the PSE and "poller activation", is at most 1 second but, statistically, about 
1/2 second.  But that is beside the point.  1) There should be an HA solution 
for OpenNMS and 2) OpenNMS should never have to be shut down unless you are 
doing an upgrade/repair to OpenNMS, the JVM, or the system.


> At least in my case, I need to restart OpenNMS sometimes in order to 
> "commit" some changes in configuration. At present, our biggest OpenNMS 
> installation has almost 2000 nodes. I think this problem can't be ignored 
> in this kind of installation.

Depending on your configuration change, you may not have to do that.  Let's 
focus on not having to restart for any configuration change.  It may be that 
your change doesn't require restart.


> Instead of checking Outages, what about checking the last PSE 
> (PassiveStatusEvent) for the status parameter? Yes, the Events table tends 
> to be a big one, but this operation is only done on OpenNMS startup:

That certainly is an option.  I think a combination… first check current 
outages, then check for any resolving PSE for those outages that is later than 
the outage time, then check for any PSE "Downs" with no matching "Ups" since 
the last PSE outage was recorded.
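
As a rough illustration of that combined startup check, here is a minimal 
sketch in Python.  Everything in it is hypothetical: the record types, field 
names and the function are made up for the example and are not OpenNMS classes 
or its database schema.  It also simplifies by looking only at the most recent 
PSE per node/service, so a Down/Up pair that was missed entirely does not 
produce a historical outage.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Tuple

Key = Tuple[str, str]  # (node label, service name); hypothetical identifier


@dataclass(frozen=True)
class PassiveStatusEvent:
    """Hypothetical stand-in for a stored PSE row."""
    key: Key
    status: str   # "Up" or "Down"
    time: float   # event timestamp in seconds


@dataclass(frozen=True)
class Outage:
    """Hypothetical stand-in for an outage that is still open."""
    key: Key
    lost_time: float


def reconcile_on_startup(
    open_outages: Iterable[Outage],
    events: Iterable[PassiveStatusEvent],
) -> Tuple[List[Tuple[Key, float]], List[Tuple[Key, float]]]:
    """Return (outages_to_resolve, outages_to_open) as (key, time) pairs."""
    # Keep only the most recent PSE for each node/service.
    latest: Dict[Key, PassiveStatusEvent] = {}
    for ev in events:
        current = latest.get(ev.key)
        if current is None or ev.time > current.time:
            latest[ev.key] = ev

    # Step 1: start from the outages that are currently open.
    open_by_key = {o.key: o for o in open_outages}

    # Step 2: an open outage is resolved if the latest PSE is an "Up"
    # that is later than the time the outage began.
    to_resolve = [
        (key, latest[key].time)
        for key, outage in open_by_key.items()
        if key in latest
        and latest[key].status == "Up"
        and latest[key].time > outage.lost_time
    ]

    # Step 3: a latest PSE of "Down" with no open outage means the
    # outage was never recorded, so it should be opened now.
    to_open = [
        (key, ev.time)
        for key, ev in latest.items()
        if ev.status == "Down" and key not in open_by_key
    ]
    return to_resolve, to_open
```

In a real deployment the two inputs would come from queries against the 
outages and events tables, run once at startup, which is why the size of the 
events table matters less here.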

I'd write up an enhancement issue on this… but I still think the core problem 
is HA and restart.

David Hustace
The OpenNMS Group, Inc.

+1 919 533 0160 x7734
david_opennms (skype to my direct line)
      '\                   .  .                        |>
        \              .         ' .                   |
       O>>         .                 'o                |
        \       .                                      |
        /\    .                                        |
       / /  .'                                         |
 ^^^^^^^`^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^


