[tickets] [opensaf:tickets] #536 LOG: Incorrect handling of "partial write" when writing log record to file

2013-08-15 Thread elunlen
- **assigned_to**: elunlen



---

** [tickets:#536] LOG: Incorrect handling of "partial write" when writing log 
record to file**

**Status:** unassigned
**Created:** Thu Aug 08, 2013 10:59 AM UTC by elunlen
**Last Updated:** Thu Aug 15, 2013 01:54 PM UTC
**Owner:** elunlen

In the "partial write check", the number of bytes written is compared against 
params_in fixedLogRecordSize instead of the actual number of bytes in the 
buffer to write. The buffer should always contain the number of bytes that 
corresponds to the setting in params_in fixedLogRecordSize.
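
As an illustration of the check described above, here is a minimal sketch of a write loop that compares the result of write() against the real buffer length. The function name write_all and its parameters are assumptions made for this example; this is not the actual log service code.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: write exactly `len` bytes from `buf` to `fd`.
 * Returns 0 when the whole buffer was written, -1 on error (errno is set).
 * Completion is judged against `len`, the real number of bytes in the
 * buffer, not against a separately configured record size.
 */
int write_all(int fd, const void *buf, size_t len)
{
    const char *p = buf;
    size_t left = len;

    assert(buf != NULL || len == 0);

    while (left > 0) {
        ssize_t n = write(fd, p, left);
        if (n < 0) {
            if (errno == EINTR)
                continue;       /* interrupted before writing anything: retry */
            return -1;          /* real error, errno set by write() */
        }
        if (n == 0) {
            errno = EIO;        /* no progress: report instead of spinning */
            return -1;
        }
        p += n;                 /* partial write: advance past what was written */
        left -= (size_t)n;
    }
    return 0;
}
```

A caller would pass the length of the formatted record buffer as `len`, so a short write is detected even when that length differs from the configured record size.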



---

Sent from sourceforge.net because opensaf-tickets@lists.sourceforge.net is 
subscribed to https://sourceforge.net/p/opensaf/tickets/

To unsubscribe from further messages, a project admin can change settings at 
https://sourceforge.net/p/opensaf/admin/tickets/options.  Or, if this is a 
mailing list, you can unsubscribe from the mailing list.--
Get 100% visibility into Java/.NET code with AppDynamics Lite!
It's a free troubleshooting tool designed for production.
Get down to code-level detail for bottlenecks, with <2% overhead. 
Download for free and get started troubleshooting in minutes. 
http://pubads.g.doubleclick.net/gampad/clk?id=48897031&iu=/4140/ostg.clktrk___
Opensaf-tickets mailing list
Opensaf-tickets@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/opensaf-tickets


[tickets] [opensaf:tickets] #536 LOG: Incorrect handling of "partial write" when writing log record to file

2013-08-15 Thread elunlen
This ticket shall be fixed for 4.2 and 4.3. For devel branch (4.4) this problem 
is fixed with ticket [#9].


---

** [tickets:#536] LOG: Incorrect handling of "partial write" when writing log 
record to file**

**Status:** unassigned
**Created:** Thu Aug 08, 2013 10:59 AM UTC by elunlen
**Last Updated:** Thu Aug 08, 2013 10:59 AM UTC
**Owner:** nobody

In the "partial write check", the number of bytes written is compared against 
params_in fixedLogRecordSize instead of the actual number of bytes in the 
buffer to write. The buffer should always contain the number of bytes that 
corresponds to the setting in params_in fixedLogRecordSize.





[tickets] [opensaf:tickets] #130 LOG: Logd crashed on active controller when saLogStreamLogFullHaltThreshold value is changed to invalid for configured application streams

2013-08-15 Thread elunlen
- **status**: assigned --> unassigned
- **assigned_to**: carl johannesson --> elunlen



---

** [tickets:#130] LOG: Logd crashed on active controller when 
saLogStreamLogFullHaltThreshold value is changed to invalid for configured 
application streams**

**Status:** unassigned
**Created:** Mon May 13, 2013 09:52 AM UTC by elunlen
**Last Updated:** Mon May 13, 2013 09:59 AM UTC
**Owner:** elunlen

Logd crashed on the active controller when the value of 
saLogStreamLogFullHaltThreshold was changed to an invalid value for the 
configured application streams.

Steps to Reproduce:
===
1) Created application streams using immcfg.
2) Modified the saLogStreamLogFullHaltThreshold attribute from its default 
value to an invalid value through immcfg.

After the attribute value was changed, logd crashed on the active controller 
and the node rebooted.

==

Snippet from /var/log/messages:
===
Feb 25 12:39:06 SLES-64BIT-SLOT1 osafimmnd[2493]: NO Implementer connected: 15 
(MsgQueueService131599) <0, 2020f>
Feb 25 12:39:35 SLES-64BIT-SLOT1 sshd[3047]: Accepted keyboard-interactive/pam 
for root from 192.168.56.1 port 33674 ssh2
Feb 25 12:41:13 SLES-64BIT-SLOT1 osafimmnd[2493]: NO Ccb 2 COMMITTED 
(immcfg_SLES-64BIT-SLOT4_2819)
Feb 25 12:41:13 SLES-64BIT-SLOT1 osafamfnd[2586]: NO 
'safComp=LOG,safSu=SC-1,safSg=2N,safApp=OpenSAF' faulted due to 'avaDown' : 
Recovery is 'nodeFailfast'
Feb 25 12:41:13 SLES-64BIT-SLOT1 osafamfnd[2586]: ER 
safComp=LOG,safSu=SC-1,safSg=2N,safApp=OpenSAF Faulted due to:avaDown Recovery 
is:nodeFailfast
Feb 25 12:41:13 SLES-64BIT-SLOT1 osafamfnd[2586]: Rebooting OpenSAF NodeId = 
131343 EE Name = , Reason: Component faulted: recovery is node failfast
Feb 25 12:41:13 SLES-64BIT-SLOT1 kernel: [  291.046073] osaflogd[2530]: 
segfault at 0 ip 00410201 sp 7fffe31bdc80 error 4 in 
osaflogd[40+26000]
Feb 25 12:41:13 SLES-64BIT-SLOT1 osafimmnd[2493]: NO Implementer locally 
disconnected. Marking it as doomed 1 <4, 2010f> (safLogService)
Feb 25 12:41:13 SLES-64BIT-SLOT1 osafimmnd[2493]: NO Implementer disconnected 1 
<4, 2010f> (safLogService)
Feb 25 12:41:13 SLES-64BIT-SLOT1 opensaf_reboot: Rebooting local node
===

Core file is not generated.

Migrated from devel.opensaf.org #3024




[tickets] [opensaf:tickets] #448 LOG: Incorrect validity check for saLogStreamFileName

2013-08-15 Thread elunlen
A possible solution is to construct a path to the .cfg file using root 
path + relative path + file name + .cfg
example:  +  +  + <.cfg>
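
A minimal sketch of the path construction suggested above, assuming the parts are joined with '/' separators. The function names build_cfg_path and cfg_file_exists, the buffer size, and the separator handling are assumptions for illustration and are not taken from lgs_imm.c.

```c
#include <stddef.h>
#include <stdio.h>
#include <sys/stat.h>

/*
 * Hypothetical helper: build "<root>/<rel>/<name>.cfg" into `out`.
 * Returns 0 on success, -1 if the result did not fit in `outlen` bytes.
 */
int build_cfg_path(char *out, size_t outlen,
                   const char *root, const char *rel, const char *name)
{
    int n = snprintf(out, outlen, "%s/%s/%s.cfg", root, rel, name);

    return (n < 0 || (size_t)n >= outlen) ? -1 : 0;
}

/*
 * Hypothetical check: returns 1 if the .cfg file exists, 0 otherwise.
 * A path that is too long for the buffer is treated as "does not exist".
 */
int cfg_file_exists(const char *root, const char *rel, const char *name)
{
    char path[4096];
    struct stat st;

    if (build_cfg_path(path, sizeof path, root, rel, name) != 0)
        return 0;
    return stat(path, &st) == 0;
}
```

The point of the sketch is that the existence check runs on the complete path rather than on the bare attribute value.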



---

** [tickets:#448] LOG: Incorrect validity check for saLogStreamFileName**

**Status:** unassigned
**Created:** Mon Jun 10, 2013 12:01 PM UTC by elunlen
**Last Updated:** Thu Aug 15, 2013 01:16 PM UTC
**Owner:** elunlen

The validity check performed when changing saLogStreamFileName via the IMM adm 
interface is incorrect. It always reports that the file name does not already 
exist, because the check uses only the content of the saLogStreamFileName 
attribute and not a complete path to a file.
See file lgs_imm.c function check_attr_validity(..)





[tickets] [opensaf:tickets] #448 LOG: Incorrect validity check for saLogStreamFileName

2013-08-15 Thread elunlen
- Description has changed:

Diff:



--- old
+++ new
@@ -1,2 +1,2 @@
-Validity check when changing saLogStreamFileName via IMM adm interace is 
incorrect. Will always return information saying that the file name does not 
already exists.
+Validity check when changing saLogStreamFileName via IMM adm interace is 
incorrect. Will always return information saying that the file name does not 
already exists. The content of attribute saLogStreamFileName is used for 
checking not a complete path to any file.
 See file lgs_imm.c function check_attr_validity(..)






---

** [tickets:#448] LOG: Incorrect validity check for saLogStreamFileName**

**Status:** unassigned
**Created:** Mon Jun 10, 2013 12:01 PM UTC by elunlen
**Last Updated:** Mon Jun 10, 2013 12:01 PM UTC
**Owner:** elunlen

The validity check performed when changing saLogStreamFileName via the IMM adm 
interface is incorrect. It always reports that the file name does not already 
exist, because the check uses only the content of the saLogStreamFileName 
attribute and not a complete path to a file.
See file lgs_imm.c function check_attr_validity(..)





[tickets] [opensaf:tickets] #516 Amfd: calling immutil_saImmOiImplementerClear in avd_mds_qsd_role_evh leads to amfnd sending SIGABRT to amfd

2013-08-15 Thread Hans Feldt
Hans N and I made some progress on this one. Analysis:

Test case: cluster restart

* Due to the #540 bug the old active controller "hangs" in shutdown. The state 
here is probably that the RDE process is killed.

* The old standby controller is rebooted and comes up; RDE detects no peer and 
assumes the ACTIVE role.

* When the above happens MDS informs the old active amfd process that it is now 
QUIESCED. From the MDS documentation:

"MDS_CALLBACK_QUIESCED_ACK

This callback informs the MDS Client that it is being moved to standby 
HA-state. The callback may be due to the result of the MDS Client switching to 
quiesced HA-state or due to a competing MDS-Client showing up in active 
HA-state."

the last part is what happens here.

* This fools the old active amfd into thinking a controller switch-over is 
going on, so it calls saImmOiImplementerClear(). This call eventually times 
out and abort() is called. The reason it fails is that the local immnd process 
(and immd) has already been killed.


This is basically a controller split brain. RDE should optimally fence the peer 
controller before going active. At least it could add some more evidence into 
the decision. There are already artifacts for RDE.

The proposed small change in amfd is to check that a switch-over is pending in 
the avd_mds_qsd_role_evh() event handler. If not, just log and exit. amfnd will 
detect that and reboot the node. No core dump is generated.

Should even be easy to reproduce.
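
A rough sketch of the proposed check, written against stand-in types. The struct, field, enum, and function names below are hypothetical and do not match the real amfd control block or handler signature.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for the amfd control block. */
struct ctrl_block {
    bool swap_pending;      /* true while a controller switch-over is ordered */
};

enum qsd_action {
    QSD_PROCEED,            /* continue with the normal quiesced handling */
    QSD_LOG_AND_EXIT        /* unexpected role change: log and terminate */
};

/*
 * Decide what to do when MDS reports the quiesced ack.
 * Without a pending switch-over the ack is taken to mean that a competing
 * active instance showed up, so the caller should log and exit rather than
 * start calls towards IMM that may never return.
 */
enum qsd_action on_quiesced_ack(const struct ctrl_block *cb)
{
    if (!cb->swap_pending) {
        fprintf(stderr, "quiesced ack without pending switch-over, exiting\n");
        return QSD_LOG_AND_EXIT;
    }
    return QSD_PROCEED;
}
```

In the real handler the QSD_LOG_AND_EXIT branch would end the process, leaving it to amfnd to detect the exit and reboot the node.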



---

** [tickets:#516] Amfd: calling immutil_saImmOiImplementerClear in 
avd_mds_qsd_role_evh leads to amfnd sending SIGABRT to amfd**

**Status:** assigned
**Created:** Tue Jul 23, 2013 02:39 PM UTC by hano
**Last Updated:** Wed Aug 14, 2013 11:42 AM UTC
**Owner:** Praveen

osafamfd is "supervised" by osafamfnd: osafamfd sends "heartbeats" to 
osafamfnd. If no "heartbeats" are received within one minute, osafamfnd 
sends an abort signal to osafamfd, which then aborts (produces a core dump 
and exits). The reason osafamfd is not sending any "heartbeats" below is that 
it has received a role change message from MDS (Active to Quiesced) and calls 
immutil_saImmOiImplementerClear. IMM does not respond, so osafamfd waits 
without sending any "heartbeats" and is aborted by osafamfnd.

There are several cases with this behavior. amfd should not call 
immutil_saImmOiImplementerClear; it should instead call 
saImmOiImplementerClear directly and handle the return code and retry logic 
in the avd_main_proc poll loop, to avoid these core dumps and keep amf 
responsive.
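
One way the retry could be moved into the poll loop is sketched below: each loop iteration makes at most one attempt and then returns, so heartbeats can still be sent between attempts. All names here are hypothetical, the result codes are stand-ins rather than the real SaAisErrorT values, and the IMM call is replaced by a callback so the sketch stays self-contained.

```c
#include <stdbool.h>

/* Hypothetical result codes standing in for the IMM return values. */
enum call_rc { CALL_OK, CALL_TRY_AGAIN, CALL_FAILED };

/* Callback that performs one non-blocking attempt of the clear call. */
typedef enum call_rc (*clear_fn)(void *ctx);

struct clear_retry {
    bool pending;       /* a clear request is outstanding */
    int  tries;         /* attempts made so far */
    int  max_tries;     /* give up after this many attempts */
};

enum step_result { STEP_IDLE, STEP_DONE, STEP_RETRY_LATER, STEP_GIVE_UP };

/* Called once per poll-loop iteration; never sleeps or blocks itself. */
enum step_result clear_retry_step(struct clear_retry *r, clear_fn fn, void *ctx)
{
    enum call_rc rc;

    if (!r->pending)
        return STEP_IDLE;

    r->tries++;
    rc = fn(ctx);
    if (rc == CALL_OK) {
        r->pending = false;
        return STEP_DONE;
    }
    if (rc == CALL_TRY_AGAIN && r->tries < r->max_tries)
        return STEP_RETRY_LATER;    /* try again on a later loop iteration */

    r->pending = false;             /* hard failure or retries exhausted */
    return STEP_GIVE_UP;
}

/* Illustration-only stub: reports TRY_AGAIN until *ctx (an int) reaches 0. */
enum call_rc stub_clear(void *ctx)
{
    int *remaining = ctx;

    if (*remaining > 0) {
        (*remaining)--;
        return CALL_TRY_AGAIN;
    }
    return CALL_OK;
}
```

On STEP_RETRY_LATER the poll loop would use a finite poll timeout instead of -1 so that the next attempt is made even when no other event arrives.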

---

Core was generated by `/usr/lib64/opensaf/osafamfd'.
Program terminated with signal 6, Aborted.
 #0 0x7f08e45b6dfd in nanosleep () from /lib64/libc.so.6
(gdb) bt full
 #0 0x7f08e45b6dfd in nanosleep () from /lib64/libc.so.6
No symbol table info available.
 #1 0x7f08e45e2824 in usleep () from /lib64/libc.so.6
No symbol table info available.
 #2 0x00407506 in immutil_saImmOiImplementerClear 
(immOiHandle=94489411855) at ../../../../../osaf/tools/safimm/src/immutil.c:1042
rc = 
nTries = 54
 #3 0x0043492a in avd_mds_qsd_role_evh (cb=0x69c980, evt=) at avd_role.c:573
status = 
rc = 
 __FUNCTION__ = 
 #4 0x0043341d in avd_process_event (cb_now=0x69c980, evt=0x7ff160) at 
avd_proc.c:591
 __FUNCTION__ = 
 #5 0x004336a1 in avd_main_proc () at avd_proc.c:507
pollretval = 
cb = 0x69c980
evt = 0x7ff160
mbx_fd = 
error = 
polltmo = -1
 #6 0x004096bd in main (argc=, argv=) at 
amfd_main.c:47
error = 0
node_id = 
(gdb) quit


