Hi HansN,

I just tested with uniform buffer sizes on all nodes, sending messages at a normal pace, and the results look OK, even after hitting TIPC_ERR_OVERLOAD.
So my conclusion is that, since in general all nodes will have the same buffer sizes, we should go with the V2 patch. GA is already tagged anyway, so we have enough time for testing, and any issues we find can be resolved by the next release.

==================================================================================================
Sep 21 11:51:40 SC-1 osafamfd[15792]: NO Node 'PL-4' joined the cluster
Sep 21 11:51:40 SC-1 osafimmnd[15741]: NO Implementer connected: 17 (MsgQueueService132111) <0, 2040f>
Sep 21 11:52:41 SC-1 osafimmd[15730]: 77 MDTM: undelivered message condition ancillary data size: 0 : TIPC_ERR_OVERLOAD
Sep 21 11:52:41 SC-1 osafimmd[15730]: 7777 MDTM: undelivered message condition ancillary data: TIPC_RETDATA
Sep 21 11:52:41 SC-1 osafimmd[15730]: 77 MDTM: undelivered message condition ancillary data size: 0 : TIPC_ERR_OVERLOAD
Sep 21 11:52:41 SC-1 osafimmd[15730]: 7777 MDTM: undelivered message condition ancillary data: TIPC_RETDATA
Sep 21 11:52:41 SC-1 osafimmd[15730]: 77 MDTM: undelivered message condition ancillary data size: 0 : TIPC_ERR_OVERLOAD
Sep 21 11:52:41 SC-1 osafimmd[15730]: 7777 MDTM: undelivered message condition ancillary data: TIPC_RETDATA
Sep 21 11:52:41 SC-1 osafimmd[15730]: 77 MDTM: undelivered message condition ancillary data size: 0 : TIPC_ERR_OVERLOAD
Sep 21 11:52:41 SC-1 osafimmd[15730]: 7777 MDTM: undelivered message condition ancillary data: TIPC_RETDATA
Sep 21 11:52:41 SC-1 osafimmd[15730]: 77 MDTM: undelivered message condition ancillary data size: 0 : TIPC_ERR_OVERLOAD
Sep 21 11:52:41 SC-1 osafimmd[15730]: 7777 MDTM: undelivered message condition ancillary data: TIPC_RETDATA
Sep 21 11:52:41 SC-1 osafimmd[15730]: 77 MDTM: undelivered message condition ancillary data size: 0 : TIPC_ERR_OVERLOAD
Sep 21 11:52:41 SC-1 osafimmd[15730]: 7777 MDTM: undelivered message condition ancillary data: TIPC_RETDATA
Sep 21 11:52:41 SC-1 osafimmd[15730]: 77 MDTM: undelivered message condition ancillary data size: 0 : TIPC_ERR_OVERLOAD
Sep 21 11:52:41 SC-1 osafimmd[15730]: 7777 MDTM: undelivered message condition ancillary data: TIPC_RETDATA
Sep 21 11:52:41 SC-1 osafimmd[15730]: 77 MDTM: undelivered message condition ancillary data size: 0 : TIPC_ERR_OVERLOAD
Sep 21 11:52:41 SC-1 osafimmd[15730]: 7777 MDTM: undelivered message condition ancillary data: TIPC_RETDATA
Sep 21 11:52:41 SC-1 osafimmd[15730]: 77 MDTM: undelivered message condition ancillary data size: 0 : TIPC_ERR_OVERLOAD
Sep 21 11:52:41 SC-1 osafimmd[15730]: 7777 MDTM: undelivered message condition ancillary data: TIPC_RETDATA
Sep 21 11:52:41 SC-1 osafimmd[15730]: 77 MDTM: undelivered message condition ancillary data size: 0 : TIPC_ERR_OVERLOAD
Sep 21 11:52:41 SC-1 osafimmd[15730]: 7777 MDTM: undelivered message condition ancillary data: TIPC_RETDATA
Sep 21 11:52:41 SC-1 osafimmd[15730]: 77 MDTM: undelivered message condition ancillary data size: 0 : TIPC_ERR_OVERLOAD
Sep 21 11:52:41 SC-1 osafimmd[15730]: 7777 MDTM: undelivered message condition ancillary data: TIPC_RETDATA
Sep 21 11:52:41 SC-1 osafimmd[15730]: 77 MDTM: undelivered message condition ancillary data size: 0 : TIPC_ERR_OVERLOAD
Sep 21 11:52:41 SC-1 osafimmd[15730]: 7777 MDTM: undelivered message condition ancillary data: TIPC_RETDATA
Sep 21 11:52:41 SC-1 osafimmd[15730]: 77 MDTM: undelivered message condition ancillary data size: 0 : TIPC_ERR_OVERLOAD
Sep 21 11:52:41 SC-1 osafimmd[15730]: 7777 MDTM: undelivered message condition ancillary data: TIPC_RETDATA
Sep 21 11:52:41 SC-1 osafimmd[15730]: 77 MDTM: undelivered message condition ancillary data size: 0 : TIPC_ERR_OVERLOAD
Sep 21 11:52:41 SC-1 osafimmd[15730]: 7777 MDTM: undelivered message condition ancillary data: TIPC_RETDATA
Sep 21 11:52:41 SC-1 osafimmd[15730]: 77 MDTM: undelivered message condition ancillary data size: 0 : TIPC_ERR_OVERLOAD
Sep 21 11:52:41 SC-1 osafimmd[15730]: 7777 MDTM: undelivered message condition ancillary data: TIPC_RETDATA
Sep 21 11:52:41 SC-1 osafimmd[15730]: 77 MDTM: undelivered message condition ancillary data size: 0 : TIPC_ERR_OVERLOAD
Sep 21 11:52:41 SC-1 osafimmd[15730]: 7777 MDTM: undelivered message condition ancillary data: TIPC_RETDATA
Sep 21 11:52:41 SC-1 osafimmd[15730]: 77 MDTM: undelivered message condition ancillary data size: 0 : TIPC_ERR_OVERLOAD
Sep 21 11:52:41 SC-1 osafimmd[15730]: 7777 MDTM: undelivered message condition ancillary data: TIPC_RETDATA
==================================================================================================

On 9/21/2016 11:37 AM, A V Mahesh wrote:
> Hi HansN,
>
> On 9/20/2016 4:17 PM, Hans Nordebäck wrote:
>> Hi Mahesh,
>>
>> I think only logging is needed, as proposed in the patch, since some services
>> are already handling dropped messages. This logging will help in
>> troubleshooting. Keeping TIPC_DEST_DROPPABLE set to true will only make TIPC
>> silently drop messages; the original problem persists and needs
>> investigation, i.e. why the socket receive buffer is overloaded. One reason
>> may be the MDS poll/receive loop together with the "big" mutex lock
>> (ticket #520).
> [AVM] One valid reason could be that, in the TIPC_ERR_OVERLOAD case,
> recd_bytes is NOT zero, so the buffer overload can occur at either the TIPC
> or the MDS level.
> I will investigate more and update.
>
>> Did you check why the MDS message loss mechanism doesn't detect TIPC dropped
>> messages? AMF does detect this via e.g. "out of sync", "msg id mismatch"
>> and so on.
> [AVM] You mean the IMMD message loss mechanism?
>
> -AVM
>> /Regards HansN
>>
>> -----Original Message-----
>> From: A V Mahesh [mailto:[email protected]]
>> Sent: 20 September 2016 12:29
>> To: Anders Widell <[email protected]>; Hans Nordebäck
>> <[email protected]>
>> Cc: [email protected]; [email protected]
>> Subject: Re: [PATCH 1 of 1] MDS: Log TIPC dropped messages [#1957]
>>
>> Hi Anders Widell / HansN,
>>
>> On 9/16/2016 2:03 PM, Anders Widell wrote:
>>> The idea was to just log reception of error info messages, for
>>> trouble-shooting purposes.
>> After multiple attempts, I managed to simulate the TIPC_ERR_OVERLOAD
>> error. Once TIPC_ERR_OVERLOAD is hit, the cluster goes into an
>> unrecoverable state, because the send buffers are full.
>>
>> So we have two options:
>>
>> 1) Set TIPC_DEST_DROPPABLE to false, log the TIPC_ERR_OVERLOAD error and
>> then exit the sender gracefully, which allows the remaining nodes to survive.
>>
>> 2) Keep the current configuration as it is (TIPC_DEST_DROPPABLE set to true).
>>
>> =================================================================================================================
>> Sep 20 15:14:09 SC-1 osafamfd[3759]: NO Received node_up from 2040f: msg_id 1
>> Sep 20 15:14:09 SC-1 osafamfd[3759]: NO Node 'PL-4' joined the cluster
>> Sep 20 15:14:09 SC-1 osafimmnd[3695]: NO Implementer connected: 19 (MsgQueueService132111) <0, 2040f>
>> *Sep 20 15:16:59 SC-1 osafimmd[3684]: 77 MDTM: undelivered message condition ancillary data: TIPC_ERR_OVERLOAD*
>> Sep 20 15:17:00 SC-1 osafimmnd[3695]: WA Director Service in NOACTIVE state - fevs replies pending:1 fevs highest processed:218744
>> Sep 20 15:17:00 SC-1 osafamfnd[3773]: NO 'safComp=IMMD,safSu=SC-1,safSg=2N,safApp=OpenSAF' faulted due to 'avaDown' : Recovery is 'nodeFailfast'
>> Sep 20 15:17:00 SC-1 osafamfnd[3773]: ER safComp=IMMD,safSu=SC-1,safSg=2N,safApp=OpenSAF Faulted due to:avaDown Recovery is:nodeFailfast
>> Sep 20 15:17:00 SC-1 osafamfnd[3773]: Rebooting OpenSAF NodeId = 131343 EE Name = , Reason: Component faulted: recovery is node failfast, OwnNodeId = 131343, SupervisionTime = 60
>> Sep 20 15:17:00 SC-1 osafimmnd[3695]: WA DISCARD DUPLICATE FEVS message:218744
>> Sep 20 15:17:00 SC-1 osafimmnd[3695]: WA Error code 2 returned for message type 82 - ignoring
>> Sep 20 15:17:00 SC-1 opensaf_reboot: Rebooting local node; timeout=60
>> Sep 20 15:17:00 SC-1 osafimmnd[3695]: WA SC Absence IS allowed:900 IMMD service is DOWN
>> Sep 20 15:17:00 SC-1 osafimmnd[3695]: NO IMMD SERVICE IS DOWN, HYDRA IS CONFIGURED => UNREGISTERING IMMND form MDS
>> Sep 20 15:17:00 SC-1 osafntfimcnd[3742]: NO saImmOiDispatch() Fail SA_AIS_ERR_BAD_HANDLE (9)
>> Sep 20 15:17:00 SC-1 osafimmnd[3695]: NO Removing client id:20002010f sv_id:27
>> Sep 20 15:17:00 SC-1 osafimmnd[3695]: NO Implementer disconnected 1 <2, 2010f> (safLogService)
>> Sep 20 15:17:00 SC-1 osafimmnd[3695]: NO Removing client id:d0d0002010f sv_id:26
>> Sep 20 15:17:00 SC-1 osafimmnd[3695]: NO Removing client id:100002010f sv_id:27
>> Sep 20 15:17:00 SC-1 osafimmnd[3695]: NO Implementer disconnected 2 <16, 2010f> (@safLogService_appl)
>> Sep 20 15:17:00 SC-1 osafimmnd[3695]: NO Removing client id:130002010f sv_id:27
>> Sep 20 15:17:00 SC-1 osafimmnd[3695]: NO Implementer disconnected 3 <19, 2010f> (@OpenSafImmReplicatorA)
>> Sep 20 15:17:00 SC-1 osafimmnd[3695]: NO Removing client id:140002010f sv_id:26
>> Sep 20 15:17:00 SC-1 osafimmnd[3695]: NO Removing client id:150002010f sv_id:27
>> Sep 20 15:17:00 SC-1 osafimmnd[3695]: NO Implementer disconnected 4 <21, 2010f> (safClmService)
>> Sep 20 15:17:00 SC-1 osafimmnd[3695]: NO Removing client id:1a0002010f sv_id:27
>> Sep 20 15:17:00 SC-1 osafimmnd[3695]: NO Implementer disconnected 5 <26, 2010f> (safAmfService)
>> Sep 20 15:17:00 SC-1 osafimmnd[3695]: NO Removing client id:1b0002010f sv_id:26
>> Sep 20 15:17:00 SC-1 osafimmnd[3695]: NO Removing client id:5bc0002010f sv_id:26
>> Sep 20 15:17:00 SC-1 osafimmnd[3695]: NO Removing client id:5bd0002010f sv_id:27
>> Sep 20 15:17:00 SC-1 osafimmnd[3695]: NO Implementer disconnected 6 <1469, 2010f> (MsgQueueService131343)
>> Sep 20 15:17:00 SC-1 osafimmnd[3695]: NO Removing client id:5c00002010f sv_id:27
>> Sep 20 15:17:00 SC-1 osafimmnd[3695]: NO Implementer disconnected 10 <1472, 2010f> (safEvtService)
>> Sep 20 15:17:00 SC-1 osafimmnd[3695]: NO Removing client id:5c40002010f sv_id:27
>> Sep 20 15:17:00 SC-1 osafimmnd[3695]: NO Implementer disconnected 8 <1476, 2010f> (safSmfService)
>> Sep 20 15:17:00 SC-1 osafimmnd[3695]: NO Removing client id:5c60002010f sv_id:27
>> Sep 20 15:17:00 SC-1 osafimmnd[3695]: NO Implementer disconnected 9 <1478, 2010f> (safLckService)
>> Sep 20 15:17:00 SC-1 osafimmnd[3695]: NO Removing client id:5c70002010f sv_id:27
>> Sep 20 15:17:00 SC-1 osafimmnd[3695]: NO Implementer disconnected 7 <1479, 2010f> (safMsgGrpService)
>> Sep 20 15:17:00 SC-1 osafimmnd[3695]: NO Removing client id:5cc0002010f sv_id:27
>> Sep 20 15:17:00 SC-1 osafimmnd[3695]: NO Removing client id:5ce0002010f sv_id:27
>> Sep 20 15:17:00 SC-1 osafimmnd[3695]: NO Implementer disconnected 12 <1486, 2010f> (safCheckPointService)
>> Sep 20 15:17:00 SC-1 osafimmnd[3695]: NO Implementer disconnected 13 <0, 2020f(down)> (MsgQueueService131599)
>> Sep 20 15:17:00 SC-1 osafimmnd[3695]: NO Implementer disconnected 14 <0, 2020f(down)> (@OpenSafImmReplicatorB)
>> Sep 20 15:17:00 SC-1 osafimmnd[3695]: NO Implementer disconnected 15 <0, 2020f(down)> (@safAmfService2020f)
>> Sep 20 15:17:00 SC-1 osafimmnd[3695]: NO Impl Discarded node 2020f
>> Sep 20 15:17:00 SC-1 osafimmnd[3695]: NO Implementer disconnected 16 <0, 2030f(down)> (MsgQueueService131855)
>> Sep 20 15:17:00 SC-1 osafimmnd[3695]: NO Impl Discarded node 2030f
>> Sep 20 15:17:00 SC-1 osafimmnd[3695]: NO Implementer disconnected 19 <0, 2040f(down)> (MsgQueueService132111)
>> Sep 20 15:17:00 SC-1 osafimmnd[3695]: NO Impl Discarded node 2040f
>> Sep 20 15:17:00 SC-1 osafimmnd[3695]: NO MDS unregisterede. sleeping ...
>> Sep 20 15:17:01 SC-1 osafimmnd[3695]: NO Sleep done registering IMMND with MDS
>> Sep 20 15:17:01 SC-1 osafimmnd[3695]: NO MDS: mds_register_callback: dest 2010fe8fa0043 already exist
>> Sep 20 15:17:01 SC-1 osafimmnd[3695]: NO MDS: mds_register_callback: dest 2010fdcb60040 already exist
>> Sep 20 15:17:01 SC-1 osafimmnd[3695]: NO MDS: mds_register_callback: dest 2010fdcb6002e already exist
>> Sep 20 15:17:01 SC-1 osafimmnd[3695]: NO MDS: mds_register_callback: dest 2010fdcb60037 already exist
>> Sep 20 15:17:01 SC-1 osafimmnd[3695]: NO MDS: mds_register_callback: dest 2010fdcb60028 already exist
>> Sep 20 15:17:01 SC-1 osafimmnd[3695]: NO MDS: mds_register_callback: dest 2010fdcb6003d already exist
>> Sep 20 15:17:01 SC-1 osafimmnd[3695]: NO MDS: mds_register_callback: dest 2010fdcb6002b already exist
>> Sep 20 15:17:01 SC-1 osafimmnd[3695]: NO MDS: mds_register_callback: dest 2010fdcb6001c already exist
>> Sep 20 15:17:01 SC-1 osafimmnd[3695]: NO MDS: mds_register_callback: dest 2010fdcb60019 already exist
>> Sep 20 15:17:01 SC-1 osafimmnd[3695]: NO MDS: mds_register_callback: dest 2010fdcba0012 already exist
>> Sep 20 15:17:01 SC-1 osafimmnd[3695]: NO MDS: mds_register_callback: dest 2010fdcb60028 already exist
>> Sep 20 15:17:01 SC-1 osafimmnd[3695]: NO MDS: mds_register_callback: dest 2010fdcb60019 already exist
>> Sep 20 15:17:01 SC-1 osafimmnd[3695]: NO SUCCESS IN REGISTERING IMMND WITH MDS
>> Sep 20 15:17:01 SC-1 osafimmnd[3695]: NO Re-introduce-me highestProcessed:218744 highestReceived:218744
>> Sep 20 15:17:03 SC-1 kernel: [ 1794.198381] md: stopping all md devices.
>> Sep 20 15:17:03 SC-1 osafntfimcnd[8997]: WA ntfimcn_imm_init saImmOiInitialize_2() returned SA_AIS_ERR_TIMEOUT (5)
>> Sep 20 15:18:00 SC-1 syslog-ng[1221]: syslog-ng starting up; version='2.0.9'
>> =================================================================================================================
>>
>> -AVM
>>
>> On 9/16/2016 2:03 PM, Anders Widell wrote:
>>> I don't think we need (or even should) inform the sender when MDS
>>> receives an error information message from TIPC. Note that these error
>>> information messages are received asynchronously, when the sender has
>>> already received an OK return code from the MDS send call. The idea
>>> was to just log reception of error info messages, for trouble-shooting
>>> purposes. We already have a mechanism in MDS that informs the receiver
>>> about lost MDS messages. If we wish to inform the sender we would need
>>> to introduce a second mechanism in MDS, and at this point I don't
>>> think it is needed. Another approach we could consider is that MDS
>>> retransmits the message transparently without informing the sender.
>>> This would require MDS to internally store sent messages for a while,
>>> so that they can be retransmitted. It would also require the receiver
>>> to re-order received messages, since a retransmitted message will be
>>> received out of sequence.
>>>
>>> regards,
>>>
>>> Anders Widell
>>>
>>>
>>> On 09/16/2016 06:40 AM, A V Mahesh wrote:
>>>> Hi HansN,
>>>>
>>>> I managed to create TIPC_ERRINFO/TIPC_RETDATA error cases (not the
>>>> TIPC_ERR_OVERLOAD error) with normal messages. I observed that with
>>>> TIPC_DEST_DROPPABLE set to true, not even the TIPC_ERRINFO error is
>>>> notified (which implies the same for TIPC_ERR_OVERLOAD); with
>>>> TIPC_DEST_DROPPABLE set to false, the TIPC_ERRINFO/TIPC_RETDATA errors
>>>> are notified.
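For reference, the switch being compared in the two test runs can be sketched roughly as below. This is only an illustration, assuming an already-open TIPC socket; `set_dest_droppable` is a hypothetical helper name, not an MDS function. Only `setsockopt()`, `SOL_TIPC` and `TIPC_DEST_DROPPABLE` come from the system headers.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <linux/tipc.h>

/* Hypothetical helper (illustration only, not part of MDS): enable or
 * disable TIPC_DEST_DROPPABLE on an already-open TIPC socket.
 * droppable == 0 asks TIPC to return undeliverable messages to the sender;
 * droppable != 0 lets TIPC discard them silently.
 * Returns 0 on success, -1 on failure with errno set by setsockopt(). */
int set_dest_droppable(int sock, int droppable)
{
	int optval = droppable ? 1 : 0;

	if (setsockopt(sock, SOL_TIPC, TIPC_DEST_DROPPABLE,
		       &optval, sizeof(optval)) != 0) {
		int saved_errno = errno; /* fprintf() may overwrite errno */

		fprintf(stderr, "cannot set TIPC_DEST_DROPPABLE to %d: %s\n",
			optval, strerror(saved_errno));
		errno = saved_errno;
		return -1;
	}
	return 0;
}
```

Calling it with 0 corresponds to the "TIPC_DEST_DROPPABLE to false" run, and calling it with 1 to the "TIPC_DEST_DROPPABLE to true" run.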
>>>>
>>>> Next I will also check the implications of setting TIPC_DEST_DROPPABLE
>>>> to false on multicast and broadcast messages. Based on that, we can
>>>> arrange to set TIPC_DEST_DROPPABLE to false only when the agent has set
>>>> `i_msg_loss_indication = true`, in which case MDS can return the same
>>>> TIPC_ERR_OVERLOAD error to the agent.
>>>>
>>>> TIPC_DEST_DROPPABLE to false:
>>>>
>>>> ==================================================================
>>>>
>>>> Sep 15 16:10:39 SC-1 osafimmnd[32051]: NO Implementer disconnected 13 <0, 2040f> (MsgQueueService132111)
>>>> Sep 15 16:10:39 SC-1 osafimmd[32040]: 777 MDTM: undelivered message condition ancillary data: TIPC_ERRINFO abort err : 2
>>>> Sep 15 16:10:39 SC-1 osafimmd[32040]: 7777 MDTM: undelivered message condition ancillary data: TIPC_RETDATA
>>>> Sep 15 16:10:39 SC-1 osafimmd[32040]: NO MDS event from svc_id 25 (change:4, dest:567413369208836)
>>>> Sep 15 16:10:39 SC-1 osafimmd[32040]: 777 MDTM: undelivered message condition ancillary data: TIPC_ERRINFO abort err : 2
>>>> Sep 15 16:10:39 SC-1 osafimmd[32040]: 7777 MDTM: undelivered message condition ancillary data: TIPC_RETDATA
>>>> Sep 15 16:10:39 SC-1 osafimmd[32040]: 777 MDTM: undelivered message condition ancillary data: TIPC_ERRINFO abort err : 2
>>>> Sep 15 16:10:39 SC-1 osafimmd[32040]: 7777 MDTM: undelivered message condition ancillary data: TIPC_RETDATA
>>>> Sep 15 16:10:39 SC-1 osafimmd[32040]: 777 MDTM: undelivered message condition ancillary data: TIPC_ERRINFO abort err : 2
>>>> Sep 15 16:10:39 SC-1 osafimmd[32040]: 7777 MDTM: undelivered message condition ancillary data: TIPC_RETDATA
>>>> Sep 15 16:10:39 SC-1 osafimmd[32040]: 777 MDTM: undelivered message condition ancillary data: TIPC_ERRINFO abort err : 2
>>>> Sep 15 16:10:39 SC-1 osafimmd[32040]: 7777 MDTM: undelivered message condition ancillary data: TIPC_RETDATA
>>>> Sep 15 16:10:39 SC-1 osafimmd[32040]: 777 MDTM: undelivered message condition ancillary data: TIPC_ERRINFO abort err : 2
>>>> Sep 15 16:10:39 SC-1 osafimmd[32040]: 7777 MDTM: undelivered message condition ancillary data: TIPC_RETDATA
>>>> Sep 15 16:10:39 SC-1 osafimmd[32040]: 777 MDTM: undelivered message condition ancillary data: TIPC_ERRINFO abort err : 2
>>>> Sep 15 16:10:39 SC-1 osafimmd[32040]: 7777 MDTM: undelivered message condition ancillary data: TIPC_RETDATA
>>>> Sep 15 16:10:39 SC-1 osafimmd[32040]: 777 MDTM: undelivered message condition ancillary data: TIPC_ERRINFO abort err : 2
>>>> Sep 15 16:10:39 SC-1 osafimmd[32040]: 7777 MDTM: undelivered message condition ancillary data: TIPC_RETDATA
>>>> Sep 15 16:10:39 SC-1 osafimmd[32040]: 777 MDTM: undelivered message condition ancillary data: TIPC_ERRINFO abort err : 2
>>>> Sep 15 16:10:39 SC-1 osafimmd[32040]: 7777 MDTM: undelivered message condition ancillary data: TIPC_RETDATA
>>>> Sep 15 16:10:39 SC-1 osafimmd[32040]: 777 MDTM: undelivered message condition ancillary data: TIPC_ERRINFO abort err : 2
>>>> Sep 15 16:10:39 SC-1 osafimmd[32040]: 7777 MDTM: undelivered message condition ancillary data: TIPC_RETDATA
>>>> Sep 15 16:10:39 SC-1 osafamfd[32114]: NO Node 'PL-4' left the cluster
>>>>
>>>> ==================================================================
>>>>
>>>> TIPC_DEST_DROPPABLE to true:
>>>>
>>>> ==================================================================
>>>>
>>>> Sep 15 15:59:55 SC-1 osafimmnd[26461]: NO Implementer disconnected 13 <0, 2040f> (MsgQueueService132111)
>>>> Sep 15 15:59:55 SC-1 osafimmd[26450]: NO MDS event from svc_id 25 (change:4, dest:567412923957252)
>>>> Sep 15 15:59:55 SC-1 osafimmnd[26461]: NO Global discard node received for nodeId:2040f pid:410
>>>> Sep 15 15:59:55 SC-1 osafamfd[28810]: NO Node 'PL-4' left the cluster
>>>> Sep 15 15:59:58 SC-1 kernel: [ 5147.648737] tipc: Resetting link <1.1.1:eth0-1.1.4:eth0>, peer not responding
>>>> Sep 15 15:59:58 SC-1 kernel: [ 5147.648756] tipc: Lost link <1.1.1:eth0-1.1.4:eth0> on network plane A
>>>> Sep 15 15:59:58 SC-1 kernel: [ 5147.648771] tipc: Lost contact with <1.1.4>
>>>>
>>>> ==================================================================
>>>>
>>>> -AVM
>>>>
>>>>
>>>> On 9/1/2016 10:59 AM, Hans Nordebäck wrote:
>>>>> Hi Mahesh,
>>>>>
>>>>> I have not tested this, but the following should work:
>>>>>
>>>>> - Set BSRsock TIPC_IMPORTANCE to TIPC_LOW_IMPORTANCE
>>>>>
>>>>> - set socket receive buffer to a small value:
>>>>>
>>>>> optval = "small socket receive buffer size", 5000?
>>>>>
>>>>> setsockopt(tipc_cb.BSRsock, SOL_SOCKET, SO_RCVBUF, &optval,
>>>>> optlen)
>>>>>
>>>>> - sysctl -w net.tipc.tipc_rmem="5000 40000000 68240400" (or smaller
>>>>> values)
>>>>>
>>>>> - add some delays when processing messages in
>>>>> mdtm_process_recv_events(), to provoke overloading the socket
>>>>> receive buffer.
>>>>>
>>>>> We experienced dropped packets in a 75-node system; as a
>>>>> workaround, increasing the default socket receive buffer size seems
>>>>> to work for that setup.
>>>>>
>>>>> /Thanks HansN
>>>>>
>>>>> On 09/01/2016 05:50 AM, A V Mahesh wrote:
>>>>>> Hi HansN,
>>>>>>
>>>>>> Do you have any tips for creating an overload case?
>>>>>>
>>>>>> I would like to test and observe the TIPC_DEST_DROPPABLE enabled
>>>>>> and disabled cases.
>>>>>>
>>>>>> -AVM
>>>>>>
>>>>>>
>>>>>> On 9/1/2016 9:12 AM, A V Mahesh wrote:
>>>>>>> Hi HansN,
>>>>>>>
>>>>>>> Sorry for the delay.
>>>>>>>
>>>>>>> I will test it and get back to you soon.
>>>>>>>
>>>>>>> -AVM
>>>>>>>
>>>>>>>
>>>>>>> On 8/31/2016 4:29 PM, Hans Nordebäck wrote:
>>>>>>>> Hi Mahesh,
>>>>>>>> Any updates on this?
>>>>>>>>
>>>>>>>> /Regards HansN
>>>>>>>>
>>>>>>>> -----Original Message-----
>>>>>>>> From: Anders Widell
>>>>>>>> Sent: 25 August 2016 13:11
>>>>>>>> To: A V Mahesh <[email protected]>; Hans Nordebäck
>>>>>>>> <[email protected]>; [email protected]
>>>>>>>> Cc: [email protected]
>>>>>>>> Subject: Re: [PATCH 1 of 1] MDS: Log TIPC dropped messages
>>>>>>>> [#1957]
>>>>>>>>
>>>>>>>> Hi!
>>>>>>>>
>>>>>>>> This is what the TIPC user documentation says about
>>>>>>>> TIPC_DEST_DROPPABLE:
>>>>>>>> "This option governs the handling of messages sent by the socket
>>>>>>>> if the message cannot be delivered to its destination, either
>>>>>>>> because the receiver is congested or because the specified
>>>>>>>> receiver does not exist.
>>>>>>>> If enabled, the message is discarded; otherwise the message is
>>>>>>>> returned to the sender."
>>>>>>>>
>>>>>>>> This is what the TIPC user documentation says about the return
>>>>>>>> value from the recvmsg() system call: "When used with a
>>>>>>>> connectionless socket, a return value of 0 indicates the arrival
>>>>>>>> of a returned data message that was originally sent by this socket."
>>>>>>>>
>>>>>>>> I think the documentation is pretty clear. If you set
>>>>>>>> TIPC_DEST_DROPPABLE to true, the receiver can discard messages
>>>>>>>> e.g. when the receive buffer is full. The sender will not be
>>>>>>>> notified in this case. If TIPC_DEST_DROPPABLE is set to false,
>>>>>>>> the message will be returned to the sender in case of a full
>>>>>>>> receive buffer. The sender knows that it has received such a
>>>>>>>> returned message when the recvmsg() call returns zero.
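To make the returned-message case concrete, here is a minimal sketch of how a sender might inspect the ancillary data that recvmsg() hands back. It assumes, going by the TIPC programmer's guide, that a TIPC_ERRINFO object carries two 32-bit values (the error code and the length of the returned data); treat that layout as an assumption to verify. `struct tipc_reject_info` and `tipc_parse_reject` are hypothetical names for illustration, not MDS or TIPC API.

```c
#include <stdint.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <linux/tipc.h>

/* Hypothetical result type for this sketch; not an MDS or TIPC definition. */
struct tipc_reject_info {
	int      has_errinfo; /* a TIPC_ERRINFO object was present */
	uint32_t err;         /* TIPC error code, e.g. TIPC_ERR_OVERLOAD */
	uint32_t ret_len;     /* length of the returned data, as reported */
	int      has_retdata; /* a TIPC_RETDATA object was present */
};

/* Walk the ancillary data filled in by recvmsg() and record any
 * TIPC_ERRINFO / TIPC_RETDATA objects found at level SOL_TIPC.
 * Returns 1 if the message looks like a rejected (returned) message,
 * 0 if no such ancillary objects are present. */
int tipc_parse_reject(struct msghdr *msg, struct tipc_reject_info *out)
{
	struct cmsghdr *anc;

	memset(out, 0, sizeof(*out));
	for (anc = CMSG_FIRSTHDR(msg); anc != NULL;
	     anc = CMSG_NXTHDR(msg, anc)) {
		if (anc->cmsg_level != SOL_TIPC)
			continue;
		if (anc->cmsg_type == TIPC_ERRINFO &&
		    anc->cmsg_len >= CMSG_LEN(2 * sizeof(uint32_t))) {
			uint32_t info[2];

			/* Assumed layout: error code, then returned length. */
			memcpy(info, CMSG_DATA(anc), sizeof(info));
			out->has_errinfo = 1;
			out->err = info[0];
			out->ret_len = info[1];
		} else if (anc->cmsg_type == TIPC_RETDATA) {
			out->has_retdata = 1;
		}
	}
	return out->has_errinfo || out->has_retdata;
}
```

A caller would typically run this after recvmsg() returns, and log `err` when the function returns 1; comparing `err` against TIPC_ERR_OVERLOAD distinguishes the overload case from the other reject reasons.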
>>>>>>>>
>>>>>>>> regards,
>>>>>>>> Anders Widell
>>>>>>>>
>>>>>>>> On 08/25/2016 11:30 AM, A V Mahesh wrote:
>>>>>>>>> Hi HansN,
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On 8/23/2016 5:22 PM, Hans Nordebäck wrote:
>>>>>>>>>
>>>>>>>>>> Hi Mahesh,
>>>>>>>>>>
>>>>>>>>>> Yes, this is my understanding too: if TIPC_DROPPABLE = true,
>>>>>>>>>> TIPC may drop messages silently when the receive socket buffer
>>>>>>>>>> is full, and does not return any ancillary message.
>>>>>>>>>> If TIPC_DROPPABLE = false, TIPC may drop the message but will
>>>>>>>>>> send an ancillary message to inform about TIPC_ERR_OVERLOAD.
>>>>>>>>> [AVM]
>>>>>>>>>
>>>>>>>>> My observation and understanding are different. Based on the
>>>>>>>>> TIPC code and the Linux TIPC 2.0 Programmer's Guide, the
>>>>>>>>> TIPC_ERR_OVERLOAD error is returned when TIPC is unable to
>>>>>>>>> enqueue an incoming message on the receiving socket's receive
>>>>>>>>> queue, irrespective of whether TIPC_DEST_DROPPABLE is enabled
>>>>>>>>> or disabled.
>>>>>>>>>
>>>>>>>>> The only difference between TIPC_DEST_DROPPABLE enabled and
>>>>>>>>> disabled is this: if TIPC_DEST_DROPPABLE is enabled, the message
>>>>>>>>> is discarded, the size returned by recvmsg() is ZERO, and the
>>>>>>>>> application will get errors; if TIPC_DEST_DROPPABLE is disabled,
>>>>>>>>> the message is returned to the sender, meaning the size returned
>>>>>>>>> by recvmsg() is the size of the data the user sent, and the
>>>>>>>>> application will get errors.
>>>>>>>>>
>>>>>>>>> I did check the TIPC code and documentation, and I haven't found
>>>>>>>>> any evidence that the TIPC_ERR_OVERLOAD error code is sent
>>>>>>>>> only if TIPC_DEST_DROPPABLE = false.
>>>>>>>>>
>>>>>>>>> Even while testing #1227
>>>>>>>>> (https://sourceforge.net/p/opensaf/mailman/message/33207717/) my
>>>>>>>>> observations and understanding were that an individual TIPC
>>>>>>>>> socket is only allowed to queue up
>>>>>>>>> OVERLOAD_LIMIT_BASE/2 messages of the lowest importance level
>>>>>>>>> before it starts rejecting them.
>>>>>>>>> Once a socket's receive queue length exceeds the maximum limit
>>>>>>>>> value, the receiving socket sends out a reject message with the
>>>>>>>>> TIPC_ERR_OVERLOAD error code and cmsg_type
>>>>>>>>> TIPC_ERRINFO/TIPC_RETDATA; the TIPC code and the Linux TIPC 2.0
>>>>>>>>> Programmer's Guide confirm this.
>>>>>>>>>
>>>>>>>>> tipc/socket.c
>>>>>>>>> =======================================================
>>>>>>>>> /* Reject message if there isn't room to queue it */
>>>>>>>>>
>>>>>>>>> recv_q_len = (u32)atomic_read(&tipc_queue_size);
>>>>>>>>> if (unlikely(recv_q_len >= OVERLOAD_LIMIT_BASE)) {
>>>>>>>>>         if (rx_queue_full(msg, recv_q_len, OVERLOAD_LIMIT_BASE))
>>>>>>>>>                 return TIPC_ERR_OVERLOAD;
>>>>>>>>> }
>>>>>>>>> recv_q_len = skb_queue_len(&sk->sk_receive_queue);
>>>>>>>>> if (unlikely(recv_q_len >= (OVERLOAD_LIMIT_BASE / 2))) {
>>>>>>>>>         if (rx_queue_full(msg, recv_q_len, OVERLOAD_LIMIT_BASE / 2))
>>>>>>>>>                 return TIPC_ERR_OVERLOAD;
>>>>>>>>> }
>>>>>>>>> =======================================================
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> 2.1.17. setsockopt() of TIPC 2.0 Programmer's Guide
>>>>>>>>> =======================================================
>>>>>>>>> TIPC_DEST_DROPPABLE
>>>>>>>>> This option governs the handling of messages sent by the socket
>>>>>>>>> if the message cannot be delivered to its destination, either
>>>>>>>>> because the receiver is congested or because the specified
>>>>>>>>> receiver does not exist. If enabled, the message is discarded;
>>>>>>>>> otherwise the message is returned to the sender.
>>>>>>>>>
>>>>>>>>> By default, this option is disabled for SOCK_SEQPACKET and
>>>>>>>>> SOCK_STREAM socket types, and enabled for SOCK_RDM and
>>>>>>>>> SOCK_DGRAM. This arrangement ensures proper teardown of failed
>>>>>>>>> connections when connection-oriented data transfer is used,
>>>>>>>>> without increasing the complexity of connectionless data
>>>>>>>>> transfer.
>>>>>>>>>
>>>>>>>>> TIPC_SRC_DROPPABLE
>>>>>>>>> This option governs the handling of messages sent by the socket
>>>>>>>>> if link congestion occurs. If enabled, the message is discarded;
>>>>>>>>> otherwise the system queues the message for later transmission.
>>>>>>>>> By default, this option is disabled for SOCK_SEQPACKET,
>>>>>>>>> SOCK_STREAM, and SOCK_RDM socket types (resulting in "reliable"
>>>>>>>>> data transfer), and enabled for SOCK_DGRAM (resulting in
>>>>>>>>> "unreliable" data transfer).
>>>>>>>>> =======================================================
>>>>>>>>>
>>>>>>>>> Now I will try to create an OVERLOAD case and will update you
>>>>>>>>> soon with my latest observations.
>>>>>>>>>
>>>>>>>>> -AVM
>>>>>>>>>
>>>>>>>>>> Correcting this and adding an abort is not backward compatible,
>>>>>>>>>> as some services already handle flow control in some way; we
>>>>>>>>>> should only log when packets are dropped.
>>>>>>>>>> Regarding ticket #1960, there are other solutions than
>>>>>>>>>> introducing flow control in MDS, e.g. exposing an option that
>>>>>>>>>> lets the service choose connection-oriented or connectionless.
>>>>>>>>>> In one case, the problem with dropped messages seems to be
>>>>>>>>>> related to intensive logging by MDS.
>>>>>>>>>>
>>>>>>>>>> /Thanks HansN
>>>>>>>>>> -----Original Message-----
>>>>>>>>>> From: A V Mahesh [mailto:[email protected]]
>>>>>>>>>> Sent: 23 August 2016 11:27
>>>>>>>>>> To: Hans Nordebäck <[email protected]>; Anders Widell
>>>>>>>>>> <[email protected]>; [email protected]
>>>>>>>>>> Cc: [email protected]
>>>>>>>>>> Subject: Re: [PATCH 1 of 1] MDS: Log TIPC dropped messages
>>>>>>>>>> [#1957]
>>>>>>>>>>
>>>>>>>>>> Hi HansN,
>>>>>>>>>>
>>>>>>>>>> It seems I am missing something; please allow me to
>>>>>>>>>> understand.
>>>>>>>>>>
>>>>>>>>>> If I correctly understand your observation:
>>>>>>>>>>
>>>>>>>>>> With the current OpenSAF code (this #1957 patch NOT applied),
>>>>>>>>>> TIPC_DROPPABLE=true by default. When TIPC_ERR_OVERLOAD occurs
>>>>>>>>>> while running OpenSAF with that binary, TIPC does not deliver
>>>>>>>>>> the TIPC_ERRINFO or TIPC_RETDATA errors, and the following code
>>>>>>>>>> in function recvfrom_connectionless() is not hit. Is my
>>>>>>>>>> understanding right?
>>>>>>>>>>
>>>>>>>>>> ===============================================================
>>>>>>>>>>
>>>>>>>>>> *if (anc->cmsg_type == TIPC_ERRINFO) {*
>>>>>>>>>>     /* TIPC_ERRINFO - TIPC error code associated with a
>>>>>>>>>>        returned data message or a connection termination message so abort */
>>>>>>>>>>     m_MDS_LOG_CRITICAL("MDTM: undelivered message condition ancillary data: TIPC_ERRINFO abort err :%s", strerror(errno) );
>>>>>>>>>>     *abort();*
>>>>>>>>>> *} else if (anc->cmsg_type == TIPC_RETDATA) {*
>>>>>>>>>>     /* If we set TIPC_DEST_DROPPABLE off messge (configure
>>>>>>>>>>        TIPC to return rejected messages to the sender )
>>>>>>>>>>        we will hit this when we implement MDS retransmit
>>>>>>>>>>        lost messages abort can be replaced with flow control logic*/
>>>>>>>>>>     for (i = anc->cmsg_len - sizeof(*anc); i > 0; i--) {
>>>>>>>>>>         m_MDS_LOG_DBG("MDTM: returned byte 0x%02x\n", *cptr);
>>>>>>>>>>         cptr++;
>>>>>>>>>>     }
>>>>>>>>>>     /* TIPC_RETDATA -The contents of a returned data message so abort */
>>>>>>>>>>     m_MDS_LOG_CRITICAL("MDTM: undelivered message condition ancillary data: TIPC_RETDATA abort err :%s", strerror(errno) );
>>>>>>>>>>     *abort();*
>>>>>>>>>> }
>>>>>>>>>>
>>>>>>>>>> ===============================================================
>>>>>>>>>>
>>>>>>>>>> -AVM
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On 8/23/2016 1:08 PM, Hans Nordebäck wrote:
>>>>>>>>>>> Hi Mahesh,
>>>>>>>>>>>
>>>>>>>>>>> Please see response below with [HansN] /Thanks HansN
>>>>>>>>>>>
>>>>>>>>>>> -----Original Message-----
>>>>>>>>>>> From: A V Mahesh [mailto:[email protected]]
>>>>>>>>>>> Sent: 23 August 2016 08:25
>>>>>>>>>>> To: Hans Nordebäck <[email protected]>; Anders
>>>>>>>>>>> Widell <[email protected]>; [email protected]
>>>>>>>>>>> Cc: [email protected]
>>>>>>>>>>> Subject: Re: [PATCH 1 of 1] MDS: Log TIPC dropped messages
>>>>>>>>>>> [#1957]
>>>>>>>>>>>
>>>>>>>>>>> Hi HansN
>>>>>>>>>>>
>>>>>>>>>>> Please see response below with [AVM]
>>>>>>>>>>>
>>>>>>>>>>> -AVM
>>>>>>>>>>>
>>>>>>>>>>> On 8/23/2016 11:41 AM, Hans Nordebäck wrote:
>>>>>>>>>>>> Hi Mahesh,
>>>>>>>>>>>>
>>>>>>>>>>>> please see comments below.
>>>>>>>>>>>>
>>>>>>>>>>>> /Thanks HansN
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On 08/23/2016 07:21 AM, A V Mahesh wrote:
>>>>>>>>>>>>> Hi HansN,
>>>>>>>>>>>>>
>>>>>>>>>>>>> Let us first discuss the error handling and abort; then we
>>>>>>>>>>>>> can come back to the question of whether TIPC currently
>>>>>>>>>>>>> does or does not permit an application to send a
>>>>>>>>>>>>> multicast message with the "destination droppable" setting
>>>>>>>>>>>>> disabled.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Let us disable TIPC_DEST_DROPPABLE, so that TIPC will try to
>>>>>>>>>>>>> return an undelivered multicast message to its sender and we
>>>>>>>>>>>>> can determine whether the issue is caused by
>>>>>>>>>>>>> TIPC_ERR_OVERLOAD. This helps in debugging, so that the
>>>>>>>>>>>>> application may increase SO_SNDBUF/SO_RCVBUF to reduce the
>>>>>>>>>>>>> problem.
>>>>>>>>>>>>>
>>>>>>>>>>>>> But we still need to abort(). The reason is that the current
>>>>>>>>>>>>> MDS implementation doesn't have flow control logic (no
>>>>>>>>>>>>> retry on error), so an application like AMF can go
>>>>>>>>>>>>> wrong and the cluster will go into an unstable/unrecoverable
>>>>>>>>>>>>> state.
>>>>>>>>>>>>>
>>>>>>>>>>>> [HansN] In the current implementation messages are dropped
>>>>>>>>>>>> silently and no abort is done.
>>>>>>>>>>> [AVM] I can see abort(); in the current code. Do you mean
>>>>>>>>>>> abort(); is not working and the application (AMF) is not
>>>>>>>>>>> exiting?
>>>>>>>>>>> [HansN] When TIPC_DEST_DROPPABLE is true and messages are dropped
>>>>>>>>>>> (TIPC_ERR_OVERLOAD), no abort is performed; e.g. amfd detects
>>>>>>>>>>> this in the msg sanity check and logs "invalid msg id ..."
>>>>>>>>>>> ====================================================================
>>>>>>>>>>> if (anc->cmsg_type == TIPC_ERRINFO) {
>>>>>>>>>>>     /* TIPC_ERRINFO - TIPC error code associated with a returned
>>>>>>>>>>>        data message or a connection termination message so abort */
>>>>>>>>>>>     m_MDS_LOG_CRITICAL("MDTM: undelivered message condition ancillary data: TIPC_ERRINFO abort err :%s", strerror(errno) );
>>>>>>>>>>>     *abort();*
>>>>>>>>>>> } else if (anc->cmsg_type == TIPC_RETDATA) {
>>>>>>>>>>>     /* If we set TIPC_DEST_DROPPABLE off messge (configure TIPC
>>>>>>>>>>>        to return rejected messages to the sender )
>>>>>>>>>>>        we will hit this when we implement MDS retransmit lost
>>>>>>>>>>>        messages abort can be replaced with flow control logic*/
>>>>>>>>>>>     for (i = anc->cmsg_len - sizeof(*anc); i > 0; i--) {
>>>>>>>>>>>         m_MDS_LOG_DBG("MDTM: returned byte 0x%02x\n", *cptr);
>>>>>>>>>>>         cptr++;
>>>>>>>>>>>     }
>>>>>>>>>>>     /* TIPC_RETDATA -The contents of a returned data message so abort */
>>>>>>>>>>>     m_MDS_LOG_CRITICAL("MDTM: undelivered message condition ancillary data: TIPC_RETDATA abort err :%s", strerror(errno) );
>>>>>>>>>>>     *abort();*
>>>>>>>>>>> }
>>>>>>>>>>> ====================================================================
>>>>>>>>>>>> This patch enables logging when messages are dropped to help in debugging.
>>>>>>>>>>>> I don't agree that we should also introduce abort, but instead:
>>>>>>>>>>>> 1) Implement a solution to handle dropped messages, ticket #1960
>>>>>>>>>>> [AVM] This is nothing but a flow control implementation in MDS;
>>>>>>>>>>> this is a future enhancement.
>>>>>>>>>>>
>>>>>>>>>>>> 2) Investigate why messages may be dropped; the receiving MDS
>>>>>>>>>>>> thread is a real-time thread and should be able to consume a
>>>>>>>>>>>> large amount of incoming messages.
>>>>>>>>>>>> E.g. is the receiving MDS thread "live hanging" due to locks,
>>>>>>>>>>>> file I/O etc?
>>>>>>>>>>>>> This was the reason we haven't gone for it while addressing
>>>>>>>>>>>>> Ticket #1227
>>>>>>>>>>>>> (https://sourceforge.net/p/opensaf/mailman/message/33207717/).
>>>>>>>>>>>>> So currently we don't have any advantage in disabling
>>>>>>>>>>>>> TIPC_DEST_DROPPABLE and not allowing multicast messages.
>>>>>>>>>>>>>
>>>>>>>>>>>>> -AVM
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> On 8/18/2016 2:43 PM, Hans Nordeback wrote:
>>>>>>>>>>>>>> osaf/libs/core/mds/mds_dt_tipc.c | 32 +++++++++++++++++++++++++-------
>>>>>>>>>>>>>> 1 files changed, 25 insertions(+), 7 deletions(-)
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> diff --git a/osaf/libs/core/mds/mds_dt_tipc.c b/osaf/libs/core/mds/mds_dt_tipc.c
>>>>>>>>>>>>>> --- a/osaf/libs/core/mds/mds_dt_tipc.c
>>>>>>>>>>>>>> +++ b/osaf/libs/core/mds/mds_dt_tipc.c
>>>>>>>>>>>>>> @@ -320,6 +320,15 @@ uint32_t mdtm_tipc_init(NODE_ID nodeid,
>>>>>>>>>>>>>>          m_MDS_LOG_INFO("MDTM: Successfully set default socket option TIPC_IMP = %d", TIPCIMPORTANCE);
>>>>>>>>>>>>>>      }
>>>>>>>>>>>>>> +    int droppable = 0;
>>>>>>>>>>>>>> +    if (setsockopt(tipc_cb.BSRsock, SOL_TIPC, TIPC_DEST_DROPPABLE, &droppable, sizeof(droppable)) != 0) {
>>>>>>>>>>>>>> +        LOG_ER("MDTM: Can't set TIPC_DEST_DROPPABLE to zero err :%s\n", strerror(errno));
>>>>>>>>>>>>>> +        m_MDS_LOG_ERR("MDTM: Can't set TIPC_DEST_DROPPABLE to zero err :%s\n", strerror(errno));
>>>>>>>>>>>>>> +        osafassert(0);
>>>>>>>>>>>>>> +    } else {
>>>>>>>>>>>>>> +        m_MDS_LOG_NOTIFY("MDTM: Successfully set TIPC_DEST_DROPPABLE to zero");
>>>>>>>>>>>>>> +    }
>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>      return NCSCC_RC_SUCCESS;
>>>>>>>>>>>>>>  }
>>>>>>>>>>>>>> @@ -563,6 +572,8 @@ ssize_t recvfrom_connectionless (int sd,
>>>>>>>>>>>>>>      unsigned char *cptr;
>>>>>>>>>>>>>>      int i;
>>>>>>>>>>>>>>      int has_addr;
>>>>>>>>>>>>>> +    int anc_data[2];
>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>      ssize_t sz;
>>>>>>>>>>>>>>      has_addr = (from != NULL) && (addrlen != NULL);
>>>>>>>>>>>>>> @@ -591,19 +602,26 @@ ssize_t recvfrom_connectionless (int sd,
>>>>>>>>>>>>>>          if the message was sent using a TIPC name or name sequence as the
>>>>>>>>>>>>>>          destination rather than a TIPC port ID So abort for TIPC_ERRINFO and TIPC_RETDATA*/
>>>>>>>>>>>>>>      if (anc->cmsg_type == TIPC_ERRINFO) {
>>>>>>>>>>>>>> -        /* TIPC_ERRINFO - TIPC error code associated with a returned data message or a connection termination message so abort */
>>>>>>>>>>>>>> -        m_MDS_LOG_CRITICAL("MDTM: undelivered message condition ancillary data: TIPC_ERRINFO abort err :%s", strerror(errno) );
>>>>>>>>>>>>>> -        abort();
>>>>>>>>>>>>>> +        anc_data[0] = *((unsigned int*)(CMSG_DATA(anc) + 0));
>>>>>>>>>>>>>> +        if (anc_data[0] == TIPC_ERR_OVERLOAD) {
>>>>>>>>>>>>>> +            LOG_CR("MDTM: undelivered message condition ancillary data: TIPC_ERR_OVERLOAD");
>>>>>>>>>>>>>> +            m_MDS_LOG_CRITICAL("MDTM: undelivered message condition ancillary data: TIPC_ERR_OVERLOAD");
>>>>>>>>>>>>>> +        } else {
>>>>>>>>>>>>>> +            /* TIPC_ERRINFO - TIPC error code associated with a returned data message or a connection termination message so abort */
>>>>>>>>>>>>>> +            LOG_CR("MDTM: undelivered message condition ancillary data: TIPC_ERRINFO abort err : %d", anc_data[0]);
>>>>>>>>>>>>>> +            m_MDS_LOG_CRITICAL("MDTM: undelivered message condition ancillary data: TIPC_ERRINFO abort err : %d", anc_data[0]);
>>>>>>>>>>>>>> +        }
>>>>>>>>>>>>>>      } else if (anc->cmsg_type == TIPC_RETDATA) {
>>>>>>>>>>>>>> -        /* If we set TIPC_DEST_DROPPABLE off messge (configure TIPC to return rejected messages to the sender )
>>>>>>>>>>>>>> +        /* If we set TIPC_DEST_DROPPABLE off message (configure TIPC to return rejected messages to the sender )
>>>>>>>>>>>>>>             we will hit this when we implement MDS retransmit lost messages abort can be replaced with flow control logic*/
>>>>>>>>>>>>>>          for (i = anc->cmsg_len - sizeof(*anc); i > 0; i--) {
>>>>>>>>>>>>>> -            m_MDS_LOG_DBG("MDTM: returned byte 0x%02x\n", *cptr);
>>>>>>>>>>>>>> +            LOG_CR("MDTM: returned byte 0x%02x\n", *cptr);
>>>>>>>>>>>>>> +            m_MDS_LOG_CRITICAL("MDTM: returned byte 0x%02x\n", *cptr);
>>>>>>>>>>>>>>              cptr++;
>>>>>>>>>>>>>>          }
>>>>>>>>>>>>>>          /* TIPC_RETDATA -The contents of a returned data message so abort */
>>>>>>>>>>>>>> -        m_MDS_LOG_CRITICAL("MDTM: undelivered message condition ancillary data: TIPC_RETDATA abort err :%s", strerror(errno) );
>>>>>>>>>>>>>> -        abort();
>>>>>>>>>>>>>> +        LOG_CR("MDTM: undelivered message condition ancillary data: TIPC_RETDATA");
>>>>>>>>>>>>>> +        m_MDS_LOG_CRITICAL("MDTM: undelivered message condition ancillary data: TIPC_RETDATA");
>>>>>>>>>>>>>>      } else if (anc->cmsg_type == TIPC_DESTNAME) {
>>>>>>>>>>>>>>          if (sz == 0) {
>>>>>>>>>>>>>>              m_MDS_LOG_DBG("MDTM: recd bytes=0 on received on sock, abnormal/unknown condition. Ignoring");
>
> ------------------------------------------------------------------------------
> _______________________________________________
> Opensaf-devel mailing list
> [email protected]
> https://lists.sourceforge.net/lists/listinfo/opensaf-devel
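For reference, the TIPC_ERRINFO check discussed in this thread can be sketched in isolation as below. This is only a minimal sketch under stated assumptions, not the MDS code: `classify_undelivered` and `enum undelivered_kind` are hypothetical names invented for illustration, and the fallback `#define` values are assumed to match <linux/tipc.h> for builds where that header is not installed. Unlike the patch, it copies the error code out with memcpy() instead of a pointer cast, and it checks cmsg_level and cmsg_len before reading.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>

#if defined(__has_include)
#if __has_include(<linux/tipc.h>)
#include <linux/tipc.h>
#endif
#endif

/* Fallbacks, assumed to match <linux/tipc.h>; used only if the header is missing. */
#ifndef SOL_TIPC
#define SOL_TIPC 271
#endif
#ifndef TIPC_ERRINFO
#define TIPC_ERRINFO 1
#endif
#ifndef TIPC_ERR_OVERLOAD
#define TIPC_ERR_OVERLOAD 4
#endif

/* Hypothetical result type for this sketch. */
enum undelivered_kind {
    UNDELIVERED_NONE,     /* no TIPC_ERRINFO ancillary object found */
    UNDELIVERED_OVERLOAD, /* message was rejected with TIPC_ERR_OVERLOAD */
    UNDELIVERED_OTHER     /* message was rejected with another TIPC error */
};

/* Walk the ancillary data of a msghdr filled in by recvmsg() and report
 * whether it describes a returned (undelivered) message. If err_out is
 * not NULL, the raw TIPC error code is stored there. */
static enum undelivered_kind classify_undelivered(struct msghdr *msg,
                                                  uint32_t *err_out)
{
    struct cmsghdr *anc;

    for (anc = CMSG_FIRSTHDR(msg); anc != NULL; anc = CMSG_NXTHDR(msg, anc)) {
        uint32_t err;

        if (anc->cmsg_level != SOL_TIPC || anc->cmsg_type != TIPC_ERRINFO)
            continue;
        if (anc->cmsg_len < CMSG_LEN(sizeof(err)))
            continue; /* too short to hold an error code */

        /* The first 32-bit word of TIPC_ERRINFO is the error code. */
        memcpy(&err, CMSG_DATA(anc), sizeof(err));
        if (err_out != NULL)
            *err_out = err;
        return err == TIPC_ERR_OVERLOAD ? UNDELIVERED_OVERLOAD
                                        : UNDELIVERED_OTHER;
    }
    return UNDELIVERED_NONE;
}
```

In a receive loop this would be called on the msghdr right after recvmsg() returns, so the caller can log (or, once flow control exists, retry) on UNDELIVERED_OVERLOAD instead of aborting.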
