As promised, here's an update on the issues from my previous email: all four were addressed in H04. I'm awaiting final confirmation from our reporting guru on the SRM issue, but I believe it's been resolved. Also, we're occasionally getting duplicate notifications again, but the debug output looks different from the last time this happened, so we're treating it as a separate issue.
Because of these fixes (and others), we installed H04 as soon as it was available, so I can provide some early feedback.

* We've found two events (so far) that are no longer logged to the events database (0x1010a and 0x1010d, for model attribute changes). We're now going to conduct a full audit of what other undocumented event changes came with H04.
* It seems that our failover servers are sometimes being started as root instead of ssadmin, which is mangling file permissions. The problem started after H04, but it's not yet clear why, partly because it's only happening on some of our failover servers.
* The negative alarm counts issue cited in the release notes is not (completely) fixed, and we already have a new ticket open on it.

That's all I have so far on H04. When I have anything else of note, I'll definitely share it with the group.

HTH,
Jim

--
JIM PFLEGER | Application Architect | Insight | insight.com
t. 480.889.9680 f. 480.889.9599
[email protected]

On 3/28/11 11:22 AM, "Pfleger, Jim" <[email protected]> wrote:

> I wanted to share with everyone some of the problems we've had with Spectrum
> that we're currently working on with CA, in the hopes that knowing about them
> will help out others. I don't know if they are specific to 9.2 H03, but I do
> know that they all exist on this version.
>
> * SRM missing alarms. The SRM alarm table is missing alarms that the SRM event
>   and outage tables say should be there. We originally found this when going
>   through the outages table and trying to match up outages to alarms so we
>   could get their trouble ticket IDs. As we've continued to work this, we've
>   been able to craft a query that matches the number of 0x10701 events ("Alarm
>   will be generated") with the number of alarms actually in SRM for the same
>   time period. If anyone would like to check the consistency of their SRM
>   databases, contact me directly for the query. CA has accepted this as an
>   issue and is working to determine a cause.
>
> * Condition correlation engine does not resuppress. This one takes a bit of
>   explaining, so please bear with me. Suppose you have a simple network like
>   this:
>
>       SS --- A --- B
>
>   If B goes down, you will get several alarms, including "blade status
>   unknown", "chassis down", and "device not responding to polls". In our
>   environment, we have a condition correlation that will use "device not
>   responding" to suppress the other two, so the operators see one root cause
>   and two hidden symptoms. Now if A goes down, it will suppress B.
>   Specifically, the "device not responding" alarm on B is cleared and
>   immediately replaced with an "all device connections are unreachable" alarm.
>   The problem is that, with the original "device not responding" alarm on B
>   now cleared, there is nothing to suppress the "blade status unknown" and
>   "chassis down" alarms, which now display to the operators. Logically, the
>   alarm on A that is suppressing B should also be used to resuppress all the
>   alarms on B, but this doesn't happen. CA has accepted this as an issue, and
>   created a patch that we're currently testing.
>
> * Notifier duplicating or not sending alarms. We've received two different
>   patches for Notifier: one was for it occasionally acting on an alarm twice
>   (about 1.5% of alarms), and the other was for it not acting on an alarm at
>   all (about 0.1% of alarms). These patches were merged together into D89a.
>
> These are the ones that I think will be of wide interest. When I have
> substantial updates to share with the group, I will definitely do so.
>
> If you're seeing other strange behaviors, or have any questions about these,
> please contact me to discuss. I think that an open flow of information about
> these sorts of issues benefits us all.
>
> Jim
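
For anyone who finds the resuppression problem easier to follow in code, here is a rough Python sketch of the SS --- A --- B scenario from the quoted message. It is a toy model only and does not use any Spectrum API: the Alarm class, the correlate() function, and the resuppress flag are all made-up names for illustration, and the rule that an "unreachable" alarm is always hidden is an assumption. With resuppress=False it reproduces the behavior we reported; with resuppress=True it shows what we would logically expect.

```python
from dataclasses import dataclass

# Alarm titles taken from the scenario above; everything else is hypothetical.
ROOT = "device not responding to polls"
UNREACHABLE = "all device connections are unreachable"
SYMPTOMS = {"blade status unknown", "chassis down"}


@dataclass
class Alarm:
    device: str
    cause: str
    suppressed: bool = False


def correlate(alarms, resuppress):
    """Recompute the suppressed flag on every alarm (toy rules, not Spectrum's)."""
    for alarm in alarms:
        causes = {a.cause for a in alarms if a.device == alarm.device}
        if alarm.cause == UNREACHABLE:
            # Assumption: an upstream failure always hides this alarm.
            alarm.suppressed = True
        elif alarm.cause in SYMPTOMS:
            by_root = ROOT in causes
            # The missing step: let the upstream suppression cover the symptoms too.
            by_upstream = resuppress and UNREACHABLE in causes
            alarm.suppressed = by_root or by_upstream
        else:
            alarm.suppressed = False


def visible(alarms):
    """Return what the operators would see, as sorted (device, cause) pairs."""
    return sorted((a.device, a.cause) for a in alarms if not a.suppressed)


def b_down():
    """B goes down: one root cause alarm plus two symptom alarms."""
    return [Alarm("B", ROOT), Alarm("B", "blade status unknown"), Alarm("B", "chassis down")]


def a_down_after_b(alarms):
    """A goes down: B's root cause alarm is cleared and replaced with 'unreachable'."""
    replaced = [
        Alarm("B", UNREACHABLE) if (a.device == "B" and a.cause == ROOT) else a
        for a in alarms
    ]
    return replaced + [Alarm("A", ROOT)]
```

Running b_down() through correlate() leaves only the root cause alarm on B visible. After a_down_after_b(), correlating with resuppress=False lets the two symptom alarms on B reappear next to the alarm on A, while resuppress=True leaves only the alarm on A.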
