As promised, here's an update on the issues from my previous email - all
four were addressed in H04. I'm awaiting final confirmation from our
reporting guru on the SRM issue, but I believe it's been resolved. Also,
we're occasionally getting duplicate notifications again, but the debug
looks different from the last time this happened, so we're treating it as a
separate issue.

Because of these fixes (and others), we installed H04 as soon as it was
available, so I can provide some early feedback.

* We've found two events (so far) that are no longer logged to the events
database (0x1010a and 0x1010d, for model attr changes). We're now going to
conduct a full audit of what other undocumented event changes came with H04.
* It seems that our failover servers are sometimes being started as root
instead of ssadmin, which is mangling file permissions. The problem started
after H04, but it's not clear why yet, partly because it's only happening on
some of our failover servers.
* The negative alarm counts issue cited in the release notes is not
(completely) fixed, and we already have a new ticket open on it.
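
For anyone who wants to run a similar event audit, here is a minimal sketch
of the before/after comparison, assuming you have already exported the
distinct event codes logged over comparable periods before and after H04.
The function name and the sample lists are hypothetical and only
illustrate the idea; they are not output from our databases.

```python
def missing_event_codes(before, after):
    """Return event codes that were logged before the upgrade but not after."""
    missing = set(before) - set(after)
    # Sort numerically by the hex value so the report is stable and readable.
    return sorted(missing, key=lambda code: int(code, 16))


# Hypothetical samples: distinct event codes seen in each period.
before_h04 = ["0x10701", "0x1010a", "0x1010d"]
after_h04 = ["0x10701"]

print(missing_event_codes(before_h04, after_h04))
```

A code can also be absent simply because nothing triggered it during the
later period, so anything this flags is a candidate to verify by hand.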

That's all I have so far on H04. When I have anything else of note, I'll
definitely share it with the group.

HTH,
Jim


-- 
JIM PFLEGER  |  Application Architect  |  Insight  |  insight.com

t. 480.889.9680 f. 480.889.9599  [email protected]




On 3/28/11 11:22 AM, "Pfleger, Jim" <[email protected]> wrote:

> I wanted to share with everyone some of the problems we've had with Spectrum
> that we're currently working on with CA, in the hope that knowing about them
> will help others. I don't know if they are specific to 9.2 H03, but I do know
> that they all exist on this version.
> 
> * SRM missing alarms. The SRM alarm table is missing alarms that the SRM event
> and outage tables say should be there. We originally found this when going
> through the outages table and trying to match up outages to alarms so we could
> get their trouble ticket IDs. As we've continued to work this, we've been able
> to craft a query that matches the number of 0x10701 events ("Alarm will be
> generated") with the number of alarms actually in SRM for the same time
> period. If anyone would like to check the consistency of their SRM databases,
> contact me directly for the query. CA has accepted this as an issue and is
> working to determine a cause.
> * Condition correlation engine does not resuppress. This one takes a bit of
> explaining, so please bear with me. Suppose you have a simple network like
> this:
>>> SS --- A --- B
>> If B goes down, you will get several alarms, including "blade status
>> unknown", "chassis down", and "device not responding to polls". In our
>> environment, we have a condition correlation that will use "device not
>> responding" to suppress the other two, so the operators see one root cause,
>> and two hidden symptoms. Now if A goes down, it will suppress B.
>> Specifically, the "device not responding" alarm on B is cleared and
>> immediately replaced with an "all device connections are unreachable" alarm.
>> The problem is that, with the original "device not responding" alarm on B now
>> cleared, there is nothing to suppress the "blade status unknown" and "chassis
>> down" alarms, which now display to the operators. Logically, the alarm on A
>> that is suppressing B should also be used to resuppress all the alarms on B,
>> but this doesn't happen. CA has accepted this as an issue, and created a
>> patch that we're currently testing.
> * Notifier duplicating or not sending alarms. We've received two different
> patches for Notifier - one was for it occasionally acting on an alarm twice
> (about 1.5% of alarms), and the other was for it not acting on an alarm at all
> (about 0.1% of alarms). These patches were merged together into D89a.
> 
> These are the ones that I think will be of wide interest. When I have
> substantial updates to share with the group, I will definitely do so.
> 
> If you're seeing other strange behaviors, or have any questions about these,
> please contact me to discuss. I think that an open flow of information about
> these sorts of issues benefits us all.
> 
> Jim
> 


