Hi HansN,

I just tested with uniform buffer sizes on all nodes, sending
messages at a normal rate, and the results look OK,
even after hitting TIPC_ERR_OVERLOAD.

So my conclusion is that, since in general all nodes will have the same
buffer sizes, we can go with the V2 patch. GA is already tagged anyway,
so we have enough time for testing, and if we find issues we can
resolve them by the next release.
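
For reference, a minimal sketch of the two socket settings this thread keeps coming back to: the receive buffer size (SO_RCVBUF) and TIPC_DEST_DROPPABLE. The helper names set_rcvbuf() and set_dest_droppable() are made up for this mail and are not MDS functions; the values you would pass are deployment specific.

```c
#include <assert.h>
#include <sys/socket.h>
#include <linux/tipc.h>

/* Hypothetical helper: request a receive buffer size and return the size the
 * kernel actually applied (Linux usually reports a larger value than the one
 * requested), or -1 on error. */
static int set_rcvbuf(int fd, int bytes)
{
    int effective = 0;
    socklen_t len = sizeof(effective);

    assert(bytes > 0);
    if (setsockopt(fd, SOL_SOCKET, SO_RCVBUF, &bytes, sizeof(bytes)) != 0)
        return -1;
    if (getsockopt(fd, SOL_SOCKET, SO_RCVBUF, &effective, &len) != 0)
        return -1;
    return effective;
}

/* Hypothetical helper: droppable = 1 lets TIPC discard undeliverable messages;
 * droppable = 0 asks TIPC to return them to the sender instead.
 * Returns 0 on success, -1 on error (for example on a non-TIPC socket). */
static int set_dest_droppable(int tipc_fd, int droppable)
{
    return setsockopt(tipc_fd, SOL_TIPC, TIPC_DEST_DROPPABLE,
                      &droppable, sizeof(droppable));
}
```

Calling set_rcvbuf() with the same value on every node is what I mean by "uniform buffer sizes"; comparing the returned effective size across nodes is a quick way to confirm it.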

==================================================================================================
 


Sep 21 11:51:40 SC-1 osafamfd[15792]: NO Node 'PL-4' joined the cluster
Sep 21 11:51:40 SC-1 osafimmnd[15741]: NO Implementer connected: 17 
(MsgQueueService132111) <0, 2040f>
Sep 21 11:52:41 SC-1 osafimmd[15730]: 77 MDTM: undelivered message 
condition ancillary data size: 0 : TIPC_ERR_OVERLOAD
Sep 21 11:52:41 SC-1 osafimmd[15730]: 7777 MDTM: undelivered message 
condition ancillary data: TIPC_RETDATA
Sep 21 11:52:41 SC-1 osafimmd[15730]: 77 MDTM: undelivered message 
condition ancillary data size: 0 : TIPC_ERR_OVERLOAD
Sep 21 11:52:41 SC-1 osafimmd[15730]: 7777 MDTM: undelivered message 
condition ancillary data: TIPC_RETDATA
Sep 21 11:52:41 SC-1 osafimmd[15730]: 77 MDTM: undelivered message 
condition ancillary data size: 0 : TIPC_ERR_OVERLOAD
Sep 21 11:52:41 SC-1 osafimmd[15730]: 7777 MDTM: undelivered message 
condition ancillary data: TIPC_RETDATA
Sep 21 11:52:41 SC-1 osafimmd[15730]: 77 MDTM: undelivered message 
condition ancillary data size: 0 : TIPC_ERR_OVERLOAD
Sep 21 11:52:41 SC-1 osafimmd[15730]: 7777 MDTM: undelivered message 
condition ancillary data: TIPC_RETDATA
Sep 21 11:52:41 SC-1 osafimmd[15730]: 77 MDTM: undelivered message 
condition ancillary data size: 0 : TIPC_ERR_OVERLOAD
Sep 21 11:52:41 SC-1 osafimmd[15730]: 7777 MDTM: undelivered message 
condition ancillary data: TIPC_RETDATA
Sep 21 11:52:41 SC-1 osafimmd[15730]: 77 MDTM: undelivered message 
condition ancillary data size: 0 : TIPC_ERR_OVERLOAD
Sep 21 11:52:41 SC-1 osafimmd[15730]: 7777 MDTM: undelivered message 
condition ancillary data: TIPC_RETDATA
Sep 21 11:52:41 SC-1 osafimmd[15730]: 77 MDTM: undelivered message 
condition ancillary data size: 0 : TIPC_ERR_OVERLOAD
Sep 21 11:52:41 SC-1 osafimmd[15730]: 7777 MDTM: undelivered message 
condition ancillary data: TIPC_RETDATA
Sep 21 11:52:41 SC-1 osafimmd[15730]: 77 MDTM: undelivered message 
condition ancillary data size: 0 : TIPC_ERR_OVERLOAD
Sep 21 11:52:41 SC-1 osafimmd[15730]: 7777 MDTM: undelivered message 
condition ancillary data: TIPC_RETDATA
Sep 21 11:52:41 SC-1 osafimmd[15730]: 77 MDTM: undelivered message 
condition ancillary data size: 0 : TIPC_ERR_OVERLOAD
Sep 21 11:52:41 SC-1 osafimmd[15730]: 7777 MDTM: undelivered message 
condition ancillary data: TIPC_RETDATA
Sep 21 11:52:41 SC-1 osafimmd[15730]: 77 MDTM: undelivered message 
condition ancillary data size: 0 : TIPC_ERR_OVERLOAD
Sep 21 11:52:41 SC-1 osafimmd[15730]: 7777 MDTM: undelivered message 
condition ancillary data: TIPC_RETDATA
Sep 21 11:52:41 SC-1 osafimmd[15730]: 77 MDTM: undelivered message 
condition ancillary data size: 0 : TIPC_ERR_OVERLOAD
Sep 21 11:52:41 SC-1 osafimmd[15730]: 7777 MDTM: undelivered message 
condition ancillary data: TIPC_RETDATA
Sep 21 11:52:41 SC-1 osafimmd[15730]: 77 MDTM: undelivered message 
condition ancillary data size: 0 : TIPC_ERR_OVERLOAD
Sep 21 11:52:41 SC-1 osafimmd[15730]: 7777 MDTM: undelivered message 
condition ancillary data: TIPC_RETDATA
Sep 21 11:52:41 SC-1 osafimmd[15730]: 77 MDTM: undelivered message 
condition ancillary data size: 0 : TIPC_ERR_OVERLOAD
Sep 21 11:52:41 SC-1 osafimmd[15730]: 7777 MDTM: undelivered message 
condition ancillary data: TIPC_RETDATA
Sep 21 11:52:41 SC-1 osafimmd[15730]: 77 MDTM: undelivered message 
condition ancillary data size: 0 : TIPC_ERR_OVERLOAD
Sep 21 11:52:41 SC-1 osafimmd[15730]: 7777 MDTM: undelivered message 
condition ancillary data: TIPC_RETDATA
Sep 21 11:52:41 SC-1 osafimmd[15730]: 77 MDTM: undelivered message 
condition ancillary data size: 0 : TIPC_ERR_OVERLOAD
Sep 21 11:52:41 SC-1 osafimmd[15730]: 7777 MDTM: undelivered message 
condition ancillary data: TIPC_RETDATA
Sep 21 11:52:41 SC-1 osafimmd[15730]: 77 MDTM: undelivered message 
condition ancillary data size: 0 : TIPC_ERR_OVERLOAD
Sep 21 11:52:41 SC-1 osafimmd[15730]: 7777 MDTM: undelivered message 
condition ancillary data: TIPC_RETDATA
Sep 21 11:52:41 SC-1 osafimmd[15730]: 77 MDTM: undelivered message 
condition ancillary data size: 0 : TIPC_ERR_OVERLOAD
Sep 21 11:52:41 SC-1 osafimmd[15730]: 7777 MDTM: undelivered message 
condition ancillary data: TIPC_RETDATA

==================================================================================================
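
For anyone reading the log above: a rejected message comes back to the sender with ancillary data attached, a TIPC_ERRINFO item (error code plus length of the returned data) and a TIPC_RETDATA item (the returned payload). Below is a rough sketch of how such ancillary data can be scanned after recvmsg(). It is not the MDS code; the struct undelivered_info and the function scan_tipc_ancillary() are names invented for this illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <linux/tipc.h>

/* Hypothetical summary of the TIPC ancillary data found in one msghdr. */
struct undelivered_info {
    int has_errinfo;    /* a TIPC_ERRINFO control message was present */
    uint32_t err_code;  /* first word, e.g. TIPC_ERR_OVERLOAD */
    uint32_t ret_len;   /* second word, length of the returned data */
    int has_retdata;    /* a TIPC_RETDATA control message was present */
};

/* Walk all control messages of a msghdr filled in by recvmsg() and pick out
 * the SOL_TIPC items that describe an undelivered (returned) message. */
static struct undelivered_info scan_tipc_ancillary(struct msghdr *msg)
{
    struct undelivered_info info;
    struct cmsghdr *anc;

    assert(msg != NULL);
    memset(&info, 0, sizeof(info));
    for (anc = CMSG_FIRSTHDR(msg); anc != NULL; anc = CMSG_NXTHDR(msg, anc)) {
        if (anc->cmsg_level != SOL_TIPC)
            continue;
        if (anc->cmsg_type == TIPC_ERRINFO &&
            anc->cmsg_len >= CMSG_LEN(2 * sizeof(uint32_t))) {
            uint32_t words[2];
            /* Copy out to avoid unaligned access into the control buffer. */
            memcpy(words, CMSG_DATA(anc), sizeof(words));
            info.has_errinfo = 1;
            info.err_code = words[0];
            info.ret_len = words[1];
        } else if (anc->cmsg_type == TIPC_RETDATA) {
            info.has_retdata = 1;
        }
    }
    return info;
}
```

A caller could then log when info.err_code equals TIPC_ERR_OVERLOAD and carry on, which is the "log only" behaviour discussed in this thread.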


On 9/21/2016 11:37 AM, A V Mahesh wrote:
> Hi HansN,
>
> On 9/20/2016 4:17 PM, Hans Nordebäck wrote:
>> Hi Mahesh,
>>
>> I think only logging is needed, as proposed in the patch, since some services
>> are already handling dropped messages. This logging will help in
>> troubleshooting. Keeping TIPC_DEST_DROPPABLE set to true will only make TIPC
>> drop messages silently; the original problem persists and needs
>> investigation,
>> i.e. why the socket receive buffer is overloaded. One reason may be the
>> MDS poll/receive loop together with the "big" mutex lock (ticket #520).
> [AVM]   One valid reason could be that, in the TIPC_ERR_OVERLOAD case,
> recd_bytes is NOT zero, so the buffer overload can occur at the TIPC or
> MDS level.
>                 I will investigate more and update.
>
>> Did you check why the MDS message loss mechanism doesn't detect messages
>> dropped by TIPC? AMF
>> does detect this via e.g. "out of sync", "msg id mismatch" and so on.
> [AVM]  You mean  IMMD  message loss mechanism ?
>
> -AVM
>> /Regards HansN
>>
>> -----Original Message-----
>> From: A V Mahesh [mailto:mahesh.va...@oracle.com]
>> Sent: den 20 september 2016 12:29
>> To: Anders Widell <anders.wid...@ericsson.com>; Hans Nordebäck 
>> <hans.nordeb...@ericsson.com>
>> Cc: opensaf-devel@lists.sourceforge.net; mathi.naic...@oracle.com
>> Subject: Re: [PATCH 1 of 1] MDS: Log TIPC dropped messages [#1957]
>>
>> Hi Anders Widell / HansN,
>>
>> On 9/16/2016 2:03 PM, Anders Widell wrote:
>>> The idea was to just log reception of error info messages, for
>>> trouble-shooting purposes.
>> After multiple attempts, I managed to simulate the TIPC_ERR_OVERLOAD
>> error. After the TIPC_ERR_OVERLOAD error is hit,
>> the cluster goes into an unrecoverable state, because the send buffers are
>> full.
>>
>> So we have two options:
>>
>> 1)  Set TIPC_DEST_DROPPABLE to false, log the TIPC_ERR_OVERLOAD error and
>> then exit the sender gracefully,
>>         which allows the remaining nodes to survive.
>>
>> 2)  Keep the current configuration as it is (TIPC_DEST_DROPPABLE set to true).
>>
>> =================================================================================================================
>> Sep 20 15:14:09 SC-1 osafamfd[3759]: NO Received node_up from 2040f: msg_id 1
>> Sep 20 15:14:09 SC-1 osafamfd[3759]: NO Node 'PL-4' joined the cluster
>> Sep 20 15:14:09 SC-1 osafimmnd[3695]: NO Implementer connected: 19 (MsgQueueService132111) <0, 2040f>
>> *Sep 20 15:16:59 SC-1 osafimmd[3684]: 77 MDTM: undelivered message condition ancillary data: TIPC_ERR_OVERLOAD*
>> Sep 20 15:17:00 SC-1 osafimmnd[3695]: WA Director Service in NOACTIVE state - fevs replies pending:1 fevs highest processed:218744
>> Sep 20 15:17:00 SC-1 osafamfnd[3773]: NO 'safComp=IMMD,safSu=SC-1,safSg=2N,safApp=OpenSAF' faulted due to 'avaDown' : Recovery is 'nodeFailfast'
>> Sep 20 15:17:00 SC-1 osafamfnd[3773]: ER safComp=IMMD,safSu=SC-1,safSg=2N,safApp=OpenSAF Faulted due to:avaDown Recovery is:nodeFailfast
>> Sep 20 15:17:00 SC-1 osafamfnd[3773]: Rebooting OpenSAF NodeId = 131343 EE Name = , Reason: Component faulted: recovery is node failfast, OwnNodeId = 131343, SupervisionTime = 60
>> Sep 20 15:17:00 SC-1 osafimmnd[3695]: WA DISCARD DUPLICATE FEVS message:218744
>> Sep 20 15:17:00 SC-1 osafimmnd[3695]: WA Error code 2 returned for message type 82 - ignoring
>> Sep 20 15:17:00 SC-1 opensaf_reboot: Rebooting local node; timeout=60
>> Sep 20 15:17:00 SC-1 osafimmnd[3695]: WA SC Absence IS allowed:900 IMMD service is DOWN
>> Sep 20 15:17:00 SC-1 osafimmnd[3695]: NO IMMD SERVICE IS DOWN, HYDRA IS CONFIGURED => UNREGISTERING IMMND form MDS
>> Sep 20 15:17:00 SC-1 osafntfimcnd[3742]: NO saImmOiDispatch() Fail SA_AIS_ERR_BAD_HANDLE (9)
>> Sep 20 15:17:00 SC-1 osafimmnd[3695]: NO Removing client id:20002010f sv_id:27
>> Sep 20 15:17:00 SC-1 osafimmnd[3695]: NO Implementer disconnected 1 <2, 2010f> (safLogService)
>> Sep 20 15:17:00 SC-1 osafimmnd[3695]: NO Removing client id:d0d0002010f sv_id:26
>> Sep 20 15:17:00 SC-1 osafimmnd[3695]: NO Removing client id:100002010f sv_id:27
>> Sep 20 15:17:00 SC-1 osafimmnd[3695]: NO Implementer disconnected 2 <16, 2010f> (@safLogService_appl)
>> Sep 20 15:17:00 SC-1 osafimmnd[3695]: NO Removing client id:130002010f sv_id:27
>> Sep 20 15:17:00 SC-1 osafimmnd[3695]: NO Implementer disconnected 3 <19, 2010f> (@OpenSafImmReplicatorA)
>> Sep 20 15:17:00 SC-1 osafimmnd[3695]: NO Removing client id:140002010f sv_id:26
>> Sep 20 15:17:00 SC-1 osafimmnd[3695]: NO Removing client id:150002010f sv_id:27
>> Sep 20 15:17:00 SC-1 osafimmnd[3695]: NO Implementer disconnected 4 <21, 2010f> (safClmService)
>> Sep 20 15:17:00 SC-1 osafimmnd[3695]: NO Removing client id:1a0002010f sv_id:27
>> Sep 20 15:17:00 SC-1 osafimmnd[3695]: NO Implementer disconnected 5 <26, 2010f> (safAmfService)
>> Sep 20 15:17:00 SC-1 osafimmnd[3695]: NO Removing client id:1b0002010f sv_id:26
>> Sep 20 15:17:00 SC-1 osafimmnd[3695]: NO Removing client id:5bc0002010f sv_id:26
>> Sep 20 15:17:00 SC-1 osafimmnd[3695]: NO Removing client id:5bd0002010f sv_id:27
>> Sep 20 15:17:00 SC-1 osafimmnd[3695]: NO Implementer disconnected 6 <1469, 2010f> (MsgQueueService131343)
>> Sep 20 15:17:00 SC-1 osafimmnd[3695]: NO Removing client id:5c00002010f sv_id:27
>> Sep 20 15:17:00 SC-1 osafimmnd[3695]: NO Implementer disconnected 10 <1472, 2010f> (safEvtService)
>> Sep 20 15:17:00 SC-1 osafimmnd[3695]: NO Removing client id:5c40002010f sv_id:27
>> Sep 20 15:17:00 SC-1 osafimmnd[3695]: NO Implementer disconnected 8 <1476, 2010f> (safSmfService)
>> Sep 20 15:17:00 SC-1 osafimmnd[3695]: NO Removing client id:5c60002010f sv_id:27
>> Sep 20 15:17:00 SC-1 osafimmnd[3695]: NO Implementer disconnected 9 <1478, 2010f> (safLckService)
>> Sep 20 15:17:00 SC-1 osafimmnd[3695]: NO Removing client id:5c70002010f sv_id:27
>> Sep 20 15:17:00 SC-1 osafimmnd[3695]: NO Implementer disconnected 7 <1479, 2010f> (safMsgGrpService)
>> Sep 20 15:17:00 SC-1 osafimmnd[3695]: NO Removing client id:5cc0002010f sv_id:27
>> Sep 20 15:17:00 SC-1 osafimmnd[3695]: NO Removing client id:5ce0002010f sv_id:27
>> Sep 20 15:17:00 SC-1 osafimmnd[3695]: NO Implementer disconnected 12 <1486, 2010f> (safCheckPointService)
>> Sep 20 15:17:00 SC-1 osafimmnd[3695]: NO Implementer disconnected 13 <0, 2020f(down)> (MsgQueueService131599)
>> Sep 20 15:17:00 SC-1 osafimmnd[3695]: NO Implementer disconnected 14 <0, 2020f(down)> (@OpenSafImmReplicatorB)
>> Sep 20 15:17:00 SC-1 osafimmnd[3695]: NO Implementer disconnected 15 <0, 2020f(down)> (@safAmfService2020f)
>> Sep 20 15:17:00 SC-1 osafimmnd[3695]: NO Impl Discarded node 2020f
>> Sep 20 15:17:00 SC-1 osafimmnd[3695]: NO Implementer disconnected 16 <0, 2030f(down)> (MsgQueueService131855)
>> Sep 20 15:17:00 SC-1 osafimmnd[3695]: NO Impl Discarded node 2030f
>> Sep 20 15:17:00 SC-1 osafimmnd[3695]: NO Implementer disconnected 19 <0, 2040f(down)> (MsgQueueService132111)
>> Sep 20 15:17:00 SC-1 osafimmnd[3695]: NO Impl Discarded node 2040f
>> Sep 20 15:17:00 SC-1 osafimmnd[3695]: NO MDS unregisterede. sleeping ...
>> Sep 20 15:17:01 SC-1 osafimmnd[3695]: NO Sleep done registering IMMND with MDS
>> Sep 20 15:17:01 SC-1 osafimmnd[3695]: NO MDS: mds_register_callback: dest 2010fe8fa0043 already exist
>> Sep 20 15:17:01 SC-1 osafimmnd[3695]: NO MDS: mds_register_callback: dest 2010fdcb60040 already exist
>> Sep 20 15:17:01 SC-1 osafimmnd[3695]: NO MDS: mds_register_callback: dest 2010fdcb6002e already exist
>> Sep 20 15:17:01 SC-1 osafimmnd[3695]: NO MDS: mds_register_callback: dest 2010fdcb60037 already exist
>> Sep 20 15:17:01 SC-1 osafimmnd[3695]: NO MDS: mds_register_callback: dest 2010fdcb60028 already exist
>> Sep 20 15:17:01 SC-1 osafimmnd[3695]: NO MDS: mds_register_callback: dest 2010fdcb6003d already exist
>> Sep 20 15:17:01 SC-1 osafimmnd[3695]: NO MDS: mds_register_callback: dest 2010fdcb6002b already exist
>> Sep 20 15:17:01 SC-1 osafimmnd[3695]: NO MDS: mds_register_callback: dest 2010fdcb6001c already exist
>> Sep 20 15:17:01 SC-1 osafimmnd[3695]: NO MDS: mds_register_callback: dest 2010fdcb60019 already exist
>> Sep 20 15:17:01 SC-1 osafimmnd[3695]: NO MDS: mds_register_callback: dest 2010fdcba0012 already exist
>> Sep 20 15:17:01 SC-1 osafimmnd[3695]: NO MDS: mds_register_callback: dest 2010fdcb60028 already exist
>> Sep 20 15:17:01 SC-1 osafimmnd[3695]: NO MDS: mds_register_callback: dest 2010fdcb60019 already exist
>> Sep 20 15:17:01 SC-1 osafimmnd[3695]: NO SUCCESS IN REGISTERING IMMND WITH MDS
>> Sep 20 15:17:01 SC-1 osafimmnd[3695]: NO Re-introduce-me highestProcessed:218744 highestReceived:218744
>> Sep 20 15:17:03 SC-1 kernel: [ 1794.198381] md: stopping all md devices.
>> Sep 20 15:17:03 SC-1 osafntfimcnd[8997]: WA ntfimcn_imm_init saImmOiInitialize_2() returned SA_AIS_ERR_TIMEOUT (5)
>> Sep 20 15:18:00 SC-1 syslog-ng[1221]: syslog-ng starting up; version='2.0.9'
>> =================================================================================================================
>>
>> -AVM
>>
>> On 9/16/2016 2:03 PM, Anders Widell wrote:
>>> I don't think we need (or even should) inform the sender when MDS
>>> receives an error information message from TIPC. Note that these error
>>> information messages are received asynchronously, when the sender has
>>> already received an OK return code from the MDS send call. The idea
>>> was to just log reception of error info messages, for trouble-shooting
>>> purposes. We already have a mechanism in MDS that informs the receiver
>>> about lost MDS messages. If we wish to inform the sender we would need
>>> to introduce a second mechanism in MDS, and at this point I don't
>>> think it is needed. Another approach we could consider is that MDS
>>> retransmits the message transparently without informing the sender.
>>> This would require MDS to internally store sent messages for a while,
>>> so that they can be retransmitted. It would also require the receiver
>>> to re-order received messages, since a retransmitted message will be
>>> received out of sequence.
>>>
>>> regards,
>>>
>>> Anders Widell
>>>
>>>
>>> On 09/16/2016 06:40 AM, A V Mahesh wrote:
>>>> Hi HansN,
>>>>
>>>> I managed to create TIPC_ERRINFO/TIPC_RETDATA error cases (not the
>>>> TIPC_ERR_OVERLOAD error) with normal messages. I observed that with
>>>> TIPC_DEST_DROPPABLE set to true, even the TIPC_ERRINFO error is NOT
>>>> notified (that would include TIPC_ERR_OVERLOAD); with
>>>> TIPC_DEST_DROPPABLE set to false, the TIPC_ERRINFO/TIPC_RETDATA
>>>> errors are notified.
>>>>
>>>> Now I will also check the implications of setting TIPC_DEST_DROPPABLE
>>>> to false for multicast and broadcast messages. Based on that, we can
>>>> arrange to set TIPC_DEST_DROPPABLE to false depending on the agent's
>>>> `i_msg_loss_indication = true` condition, so that MDS can return the
>>>> same TIPC_ERR_OVERLOAD error to the agent.
>>>>
>>>> TIPC_DEST_DROPPABLE to false:
>>>>
>>>> ==================================================================
>>>>
>>>> Sep 15 16:10:39 SC-1 osafimmnd[32051]: NO Implementer disconnected 13 <0, 2040f> (MsgQueueService132111)
>>>> Sep 15 16:10:39 SC-1 osafimmd[32040]:  777 MDTM: undelivered message condition ancillary data: TIPC_ERRINFO abort err : 2
>>>> Sep 15 16:10:39 SC-1 osafimmd[32040]: 7777 MDTM: undelivered message condition ancillary data: TIPC_RETDATA
>>>> Sep 15 16:10:39 SC-1 osafimmd[32040]: NO MDS event from svc_id 25 (change:4, dest:567413369208836)
>>>> Sep 15 16:10:39 SC-1 osafimmd[32040]:  777 MDTM: undelivered message condition ancillary data: TIPC_ERRINFO abort err : 2
>>>> Sep 15 16:10:39 SC-1 osafimmd[32040]: 7777 MDTM: undelivered message condition ancillary data: TIPC_RETDATA
>>>> Sep 15 16:10:39 SC-1 osafimmd[32040]:  777 MDTM: undelivered message condition ancillary data: TIPC_ERRINFO abort err : 2
>>>> Sep 15 16:10:39 SC-1 osafimmd[32040]: 7777 MDTM: undelivered message condition ancillary data: TIPC_RETDATA
>>>> Sep 15 16:10:39 SC-1 osafimmd[32040]:  777 MDTM: undelivered message condition ancillary data: TIPC_ERRINFO abort err : 2
>>>> Sep 15 16:10:39 SC-1 osafimmd[32040]: 7777 MDTM: undelivered message condition ancillary data: TIPC_RETDATA
>>>> Sep 15 16:10:39 SC-1 osafimmd[32040]:  777 MDTM: undelivered message condition ancillary data: TIPC_ERRINFO abort err : 2
>>>> Sep 15 16:10:39 SC-1 osafimmd[32040]: 7777 MDTM: undelivered message condition ancillary data: TIPC_RETDATA
>>>> Sep 15 16:10:39 SC-1 osafimmd[32040]:  777 MDTM: undelivered message condition ancillary data: TIPC_ERRINFO abort err : 2
>>>> Sep 15 16:10:39 SC-1 osafimmd[32040]: 7777 MDTM: undelivered message condition ancillary data: TIPC_RETDATA
>>>> Sep 15 16:10:39 SC-1 osafimmd[32040]:  777 MDTM: undelivered message condition ancillary data: TIPC_ERRINFO abort err : 2
>>>> Sep 15 16:10:39 SC-1 osafimmd[32040]: 7777 MDTM: undelivered message condition ancillary data: TIPC_RETDATA
>>>> Sep 15 16:10:39 SC-1 osafimmd[32040]:  777 MDTM: undelivered message condition ancillary data: TIPC_ERRINFO abort err : 2
>>>> Sep 15 16:10:39 SC-1 osafimmd[32040]: 7777 MDTM: undelivered message condition ancillary data: TIPC_RETDATA
>>>> Sep 15 16:10:39 SC-1 osafimmd[32040]:  777 MDTM: undelivered message condition ancillary data: TIPC_ERRINFO abort err : 2
>>>> Sep 15 16:10:39 SC-1 osafimmd[32040]: 7777 MDTM: undelivered message condition ancillary data: TIPC_RETDATA
>>>> Sep 15 16:10:39 SC-1 osafimmd[32040]:  777 MDTM: undelivered message condition ancillary data: TIPC_ERRINFO abort err : 2
>>>> Sep 15 16:10:39 SC-1 osafimmd[32040]: 7777 MDTM: undelivered message condition ancillary data: TIPC_RETDATA
>>>> Sep 15 16:10:39 SC-1 osafamfd[32114]: NO Node 'PL-4' left the cluster
>>>>
>>>> ==================================================================
>>>>
>>>> TIPC_DEST_DROPPABLE to true:
>>>>
>>>> ==================================================================
>>>>
>>>> Sep 15 15:59:55 SC-1 osafimmnd[26461]: NO Implementer disconnected 13 <0, 2040f> (MsgQueueService132111)
>>>> Sep 15 15:59:55 SC-1 osafimmd[26450]: NO MDS event from svc_id 25 (change:4, dest:567412923957252)
>>>> Sep 15 15:59:55 SC-1 osafimmnd[26461]: NO Global discard node received for nodeId:2040f pid:410
>>>> Sep 15 15:59:55 SC-1 osafamfd[28810]: NO Node 'PL-4' left the cluster
>>>> Sep 15 15:59:58 SC-1 kernel: [ 5147.648737] tipc: Resetting link <1.1.1:eth0-1.1.4:eth0>, peer not responding
>>>> Sep 15 15:59:58 SC-1 kernel: [ 5147.648756] tipc: Lost link <1.1.1:eth0-1.1.4:eth0> on network plane A
>>>> Sep 15 15:59:58 SC-1 kernel: [ 5147.648771] tipc: Lost contact with <1.1.4>
>>>>
>>>> ==================================================================
>>>>
>>>> -AVM
>>>>
>>>>
>>>> On 9/1/2016 10:59 AM, Hans Nordebäck wrote:
>>>>> Hi Mahesh,
>>>>>
>>>>> I have not tested this, but the following should work:
>>>>>
>>>>> - Set BSRsock TIPC_IMPORTANCE to TIPC_LOW_IMPORTANCE
>>>>>
>>>>> - set socket receive buffer to a small value:
>>>>>
>>>>>     optval = "small socket receive buffer size", 5000 ?
>>>>>
>>>>>     setsockopt(tipc_cb.BSRsock, SOL_SOCKET, SO_RCVBUF, &optval, optlen)
>>>>>
>>>>> -  sysctl -w net.tipc.tipc_rmem="5000 40000000 68240400" (or smaller
>>>>> values)
>>>>>
>>>>> - add some delays when processing messages in
>>>>> mdtm_process_recv_events(), to provoke overloading the socket
>>>>> receive buffer.
>>>>>
>>>>> We experience dropped packets in a 75 node system, and as a
>>>>> workaround, increasing the default socket receive buffer size seems
>>>>> to work for that setup.
>>>>>
>>>>> /Thanks HansN
>>>>>
>>>>> On 09/01/2016 05:50 AM, A V Mahesh wrote:
>>>>>> Hi HansN,
>>>>>>
>>>>>> Do you have any tips for creating an overload case?
>>>>>>
>>>>>> I would like to test and observe the TIPC_DEST_DROPPABLE enabled and
>>>>>> disabled cases.
>>>>>>
>>>>>> -AVM
>>>>>>
>>>>>>
>>>>>> On 9/1/2016 9:12 AM, A V Mahesh wrote:
>>>>>>> Hi HansN,
>>>>>>>
>>>>>>> Sorry for the delay.
>>>>>>>
>>>>>>> I will test it and get back to you soon.
>>>>>>>
>>>>>>> -AVM
>>>>>>>
>>>>>>>
>>>>>>> On 8/31/2016 4:29 PM, Hans Nordebäck wrote:
>>>>>>>> Hi Mahesh,
>>>>>>>> Any updates on this?
>>>>>>>>
>>>>>>>> /Regards HansN
>>>>>>>>
>>>>>>>> -----Original Message-----
>>>>>>>> From: Anders Widell
>>>>>>>> Sent: den 25 augusti 2016 13:11
>>>>>>>> To: A V Mahesh <mahesh.va...@oracle.com>; Hans Nordebäck
>>>>>>>> <hans.nordeb...@ericsson.com>; mathi.naic...@oracle.com
>>>>>>>> Cc: opensaf-devel@lists.sourceforge.net
>>>>>>>> Subject: Re: [PATCH 1 of 1] MDS: Log TIPC dropped messages
>>>>>>>> [#1957]
>>>>>>>>
>>>>>>>> Hi!
>>>>>>>>
>>>>>>>> This is what the TIPC user documentation says about
>>>>>>>> TIPC_DEST_DROPPABLE:
>>>>>>>> "This option governs the handling of messages sent by the socket
>>>>>>>> if the message cannot be delivered to its destination, either
>>>>>>>> because the receiver is congested or because the specified
>>>>>>>> receiver does not exist.
>>>>>>>> If enabled, the message is discarded; otherwise the message is
>>>>>>>> returned to the sender."
>>>>>>>>
>>>>>>>> This is what the TIPC user documentation says about the return
>>>>>>>> value from the recvmsg() system call: "When used with a
>>>>>>>> connectionless socket, a return value of 0 indicates the arrival
>>>>>>>> of a returned data message that was originally sent by this socket."
>>>>>>>>
>>>>>>>> I think the documentation is pretty clear. If you set
>>>>>>>> TIPC_DEST_DROPPABLE to true, the receiver can discard messages
>>>>>>>> e.g. when the receive buffer is full. The sender will not be
>>>>>>>> notified in this case. If TIPC_DEST_DROPPABLE is set to false,
>>>>>>>> the message will be returned to the sender in case of a full
>>>>>>>> receive buffer. The sender knows that it has received such a
>>>>>>>> returned message when the recvmsg() call returns zero.
>>>>>>>>
>>>>>>>> regards,
>>>>>>>> Anders Widell
>>>>>>>>
>>>>>>>> On 08/25/2016 11:30 AM, A V Mahesh wrote:
>>>>>>>>> Hi HansN,
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On 8/23/2016 5:22 PM, Hans Nordebäck wrote:
>>>>>>>>>
>>>>>>>>>> Hi Mahesh,
>>>>>>>>>>
>>>>>>>>>> Yes, this is my understanding too: if TIPC_DROPPABLE = true,
>>>>>>>>>> TIPC may drop messages silently when the receive socket buffer
>>>>>>>>>> is full, and does not return any ancillary message.
>>>>>>>>>> If TIPC_DROPPABLE = false, TIPC may drop the message but will send
>>>>>>>>>> an ancillary message to inform about TIPC_ERR_OVERLOAD.
>>>>>>>>> [AVM]
>>>>>>>>>
>>>>>>>>> My observation and understanding are different: based on the TIPC
>>>>>>>>> code and the Linux TIPC 2.0 Programmer's Guide, the
>>>>>>>>> TIPC_ERR_OVERLOAD error is returned when TIPC is unable to enqueue
>>>>>>>>> an incoming message on the receiving socket's receive queue,
>>>>>>>>> irrespective of whether TIPC_DEST_DROPPABLE is enabled or disabled.
>>>>>>>>>
>>>>>>>>> The only difference between TIPC_DEST_DROPPABLE enabled and
>>>>>>>>> disabled is this: if TIPC_DEST_DROPPABLE is enabled, the message is
>>>>>>>>> discarded,
>>>>>>>>> recvmsg() returns a size of ZERO and the application will get errors;
>>>>>>>>> if TIPC_DEST_DROPPABLE is disabled, the message is returned to the
>>>>>>>>> sender, meaning that recvmsg() returns the size of the data the user
>>>>>>>>> sent, and the application will get errors.
>>>>>>>>>
>>>>>>>>> I checked the TIPC code and documentation and I have not found
>>>>>>>>> any evidence that the TIPC_ERR_OVERLOAD error code is sent
>>>>>>>>> only if TIPC_DEST_DROPPABLE = false.
>>>>>>>>>
>>>>>>>>> Even while testing #1227
>>>>>>>>> (https://sourceforge.net/p/opensaf/mailman/message/33207717/), my
>>>>>>>>> observation and understanding was that an individual TIPC socket is
>>>>>>>>> only allowed to queue up
>>>>>>>>> OVERLOAD_LIMIT_BASE/2 messages of the lowest importance level
>>>>>>>>> before it starts rejecting them.
>>>>>>>>> Once a socket's receive queue length exceeds the maximum limit
>>>>>>>>> value, the receiving socket will send out a reject message with
>>>>>>>>> the TIPC_ERR_OVERLOAD error code, with cmsg_type
>>>>>>>>> TIPC_ERRINFO/TIPC_RETDATA; the TIPC code and the Linux TIPC 2.0
>>>>>>>>> Programmer's Guide confirm the same.
>>>>>>>>>
>>>>>>>>> tipc/socket.c
>>>>>>>>> =======================================================
>>>>>>>>> /* Reject message if there isn't room to queue it */
>>>>>>>>>
>>>>>>>>> recv_q_len = (u32)atomic_read(&tipc_queue_size);
>>>>>>>>> if (unlikely(recv_q_len >= OVERLOAD_LIMIT_BASE)) {
>>>>>>>>>        if (rx_queue_full(msg, recv_q_len, OVERLOAD_LIMIT_BASE))
>>>>>>>>>            return TIPC_ERR_OVERLOAD;
>>>>>>>>> }
>>>>>>>>> recv_q_len = skb_queue_len(&sk->sk_receive_queue);
>>>>>>>>> if (unlikely(recv_q_len >= (OVERLOAD_LIMIT_BASE / 2))) {
>>>>>>>>>        if (rx_queue_full(msg, recv_q_len, OVERLOAD_LIMIT_BASE / 2))
>>>>>>>>>            return TIPC_ERR_OVERLOAD;
>>>>>>>>> }
>>>>>>>>> =======================================================
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> 2.1.17. setsockopt() of  TIPC 2.0 Programmer's Guide
>>>>>>>>> =======================================================
>>>>>>>>> TIPC_DEST_DROPPABLE
>>>>>>>>> This option governs the handling of messages sent by the socket
>>>>>>>>> if the message cannot be delivered to its destination, either
>>>>>>>>> because the receiver is congested or because the specified
>>>>>>>>> receiver does not exist. If enabled, the message is discarded;
>>>>>>>>> otherwise the message is returned to the sender.
>>>>>>>>>
>>>>>>>>> By default, this option is disabled for SOCK_SEQPACKET and
>>>>>>>>> SOCK_STREAM socket types, and enabled for SOCK_RDM and
>>>>>>>>> SOCK_DGRAM. This arrangement ensures proper teardown of failed
>>>>>>>>> connections when connection-oriented data transfer is used,
>>>>>>>>> without increasing the complexity of connectionless data
>>>>>>>>> transfer.
>>>>>>>>>
>>>>>>>>> TIPC_SRC_DROPPABLE
>>>>>>>>> This option governs the handling of messages sent by the socket
>>>>>>>>> if link congestion occurs. If enabled, the message is discarded;
>>>>>>>>> otherwise the system queues the message for later transmission.
>>>>>>>>> By default, this option is disabled for SOCK_SEQPACKET,
>>>>>>>>> SOCK_STREAM, and SOCK_RDM socket types (resulting in "reliable"
>>>>>>>>> data transfer), and enabled for SOCK_DGRAM (resulting in
>>>>>>>>> "unreliable" data transfer).
>>>>>>>>> =======================================================
>>>>>>>>>
>>>>>>>>> Now I will try to create the OVERLOAD case and update you soon with
>>>>>>>>> my latest observations.
>>>>>>>>>
>>>>>>>>> -AVM
>>>>>>>>>
>>>>>>>>>> Correcting this and adding an abort is not backward compatible,
>>>>>>>>>> as some services already handle flow control in some way; only
>>>>>>>>>> log when packets are dropped.
>>>>>>>>>> Regarding ticket #1960, there are other solutions than
>>>>>>>>>> introducing flow control in MDS, e.g. exposing an option to the
>>>>>>>>>> service to choose connection-oriented or connectionless.
>>>>>>>>>> The problem with dropped messages seems in one case to be related
>>>>>>>>>> to intensive logging by MDS.
>>>>>>>>>>
>>>>>>>>>> /Thanks HansN
>>>>>>>>>> -----Original Message-----
>>>>>>>>>> From: A V Mahesh [mailto:mahesh.va...@oracle.com]
>>>>>>>>>> Sent: den 23 augusti 2016 11:27
>>>>>>>>>> To: Hans Nordebäck <hans.nordeb...@ericsson.com>; Anders Widell
>>>>>>>>>> <anders.wid...@ericsson.com>; mathi.naic...@oracle.com
>>>>>>>>>> Cc: opensaf-devel@lists.sourceforge.net
>>>>>>>>>> Subject: Re: [PATCH 1 of 1] MDS: Log TIPC dropped messages
>>>>>>>>>> [#1957]
>>>>>>>>>>
>>>>>>>>>> Hi HansN,
>>>>>>>>>>
>>>>>>>>>> It seems I am missing something, so please allow me to
>>>>>>>>>> understand.
>>>>>>>>>>
>>>>>>>>>> If I correctly understand your observation:
>>>>>>>>>>
>>>>>>>>>> With the current OpenSAF code (this #1957 patch NOT applied), by
>>>>>>>>>> default TIPC_DROPPABLE=true. While running OpenSAF with that
>>>>>>>>>> binary, when TIPC_ERR_OVERLOAD occurs, TIPC does not give the
>>>>>>>>>> errors TIPC_ERRINFO or TIPC_RETDATA, and the following code in
>>>>>>>>>> the function recvfrom_connectionless() is not being hit. Is my
>>>>>>>>>> understanding right?
>>>>>>>>>>
>>>>>>>>>> =====================================================================
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> *if (anc->cmsg_type == TIPC_ERRINFO) {*
>>>>>>>>>>          /* TIPC_ERRINFO - TIPC error code associated with a
>>>>>>>>>>             returned data message or a connection termination
>>>>>>>>>>             message  so abort */
>>>>>>>>>>          m_MDS_LOG_CRITICAL("MDTM: undelivered message condition ancillary data: TIPC_ERRINFO abort err :%s", strerror(errno));
>>>>>>>>>> *abort();*
>>>>>>>>>> *} else if (anc->cmsg_type == TIPC_RETDATA) {*
>>>>>>>>>>          /* If we set TIPC_DEST_DROPPABLE off messge (configure
>>>>>>>>>>             TIPC to return rejected messages to the sender )
>>>>>>>>>>             we will hit this when we implement MDS retransmit
>>>>>>>>>>             lost messages abort can be replaced with flow control logic*/
>>>>>>>>>>          for (i = anc->cmsg_len - sizeof(*anc); i > 0; i--) {
>>>>>>>>>>              m_MDS_LOG_DBG("MDTM: returned byte 0x%02x\n", *cptr);
>>>>>>>>>>              cptr++;
>>>>>>>>>>          }
>>>>>>>>>>          /* TIPC_RETDATA -The contents of a returned data message
>>>>>>>>>>             so abort */
>>>>>>>>>>          m_MDS_LOG_CRITICAL("MDTM: undelivered message condition ancillary data: TIPC_RETDATA abort err :%s", strerror(errno));
>>>>>>>>>> *abort();*
>>>>>>>>>> }
>>>>>>>>>>
>>>>>>>>>> =====================================================================
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> -AVM
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On 8/23/2016 1:08 PM, Hans Nordebäck wrote:
>>>>>>>>>>> Hi Mahesh,
>>>>>>>>>>>
>>>>>>>>>>> Please see response below with [HansN] /Thanks HansN
>>>>>>>>>>>
>>>>>>>>>>> -----Original Message-----
>>>>>>>>>>> From: A V Mahesh [mailto:mahesh.va...@oracle.com]
>>>>>>>>>>> Sent: den 23 augusti 2016 08:25
>>>>>>>>>>> To: Hans Nordebäck <hans.nordeb...@ericsson.com>; Anders
>>>>>>>>>>> Widell <anders.wid...@ericsson.com>; mathi.naic...@oracle.com
>>>>>>>>>>> Cc: opensaf-devel@lists.sourceforge.net
>>>>>>>>>>> Subject: Re: [PATCH 1 of 1] MDS: Log TIPC dropped messages
>>>>>>>>>>> [#1957]
>>>>>>>>>>>
>>>>>>>>>>> Hi HansN
>>>>>>>>>>>
>>>>>>>>>>> Please see response below with [AVM]
>>>>>>>>>>>
>>>>>>>>>>> -AVM
>>>>>>>>>>>
>>>>>>>>>>> On 8/23/2016 11:41 AM, Hans Nordebäck wrote:
>>>>>>>>>>>> Hi Mahesh,
>>>>>>>>>>>>
>>>>>>>>>>>> please see comments below.
>>>>>>>>>>>>
>>>>>>>>>>>> /Thanks HansN
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On 08/23/2016 07:21 AM, A V Mahesh wrote:
>>>>>>>>>>>>> Hi HansN,
>>>>>>>>>>>>>
>>>>>>>>>>>>> Let us first discuss the error handling and abort; then we
>>>>>>>>>>>>> can come back to the interpretation of whether TIPC currently
>>>>>>>>>>>>> does or does not permit an application to send a
>>>>>>>>>>>>> multicast message with the "destination droppable" setting
>>>>>>>>>>>>> disabled.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Let us disable TIPC_DEST_DROPPABLE, so that TIPC will try to
>>>>>>>>>>>>> return an undelivered multicast message to its sender and we
>>>>>>>>>>>>> can determine whether the issue is caused by TIPC_ERR_OVERLOAD.
>>>>>>>>>>>>> This helps in debugging, so that the application may increase
>>>>>>>>>>>>> SO_SNDBUF/SO_RCVBUF to reduce the problem.
>>>>>>>>>>>>>
>>>>>>>>>>>>> But still we need to abort(), the reason for that is current
>>>>>>>>>>>>> MDS implementations doesn't have flow control logic ( no
>>>>>>>>>>>>> retry because of error ) , so Application like AMF can go
>>>>>>>>>>>>> wrong and cluster will go into unstable/recoverble state.
>>>>>>>>>>>>>
>>>>>>>>>>>> [HansN] In the current implementation messages are dropped
>>>>>>>>>>>> silently and no abort is done.
>>>>>>>>>>> [AVM]  I can see abort(); in the current code. Do you mean
>>>>>>>>>>> abort(); is not working and the application (amf) is not
>>>>>>>>>>> exiting?
>>>>>>>>>>> [HansN] When TIPC_DEST_DROPPABLE is true and messages are
>>>>>>>>>>> dropped (TIPC_ERR_OVERLOAD), no abort is performed; e.g. amfd
>>>>>>>>>>> detects this in the msg sanity chk and logs "invalid msg id
>>>>>>>>>>> ..."
>>>>>>>>>>> ====================================================================
>>>>>>>>>>> if (anc->cmsg_type == TIPC_ERRINFO) {
>>>>>>>>>>>           /* TIPC_ERRINFO - TIPC error code associated with a returned data message or a connection termination message so abort */
>>>>>>>>>>>           m_MDS_LOG_CRITICAL("MDTM: undelivered message condition ancillary data: TIPC_ERRINFO abort err :%s", strerror(errno) );
>>>>>>>>>>>           *abort();*
>>>>>>>>>>> } else if (anc->cmsg_type == TIPC_RETDATA) {
>>>>>>>>>>>           /* If we set TIPC_DEST_DROPPABLE off messge (configure TIPC to return rejected messages to the sender )
>>>>>>>>>>>              we will hit this when we implement MDS retransmit lost messages abort can be replaced with flow control logic*/
>>>>>>>>>>>           for (i = anc->cmsg_len - sizeof(*anc); i > 0; i--) {
>>>>>>>>>>>               m_MDS_LOG_DBG("MDTM: returned byte 0x%02x\n", *cptr);
>>>>>>>>>>>               cptr++;
>>>>>>>>>>>           }
>>>>>>>>>>>           /* TIPC_RETDATA -The contents of a returned data message  so abort */
>>>>>>>>>>>           m_MDS_LOG_CRITICAL("MDTM: undelivered message condition ancillary data: TIPC_RETDATA abort err :%s", strerror(errno) );
>>>>>>>>>>>           *abort();*
>>>>>>>>>>> }
>>>>>>>>>>> ====================================================================
>>>>>>>>>>>> This patch enables logging
>>>>>>>>>>>> when packages are dropped to help in debugging. I don't agree
>>>>>>>>>>>> that we should also introduce abort, but instead:
>>>>>>>>>>>> 1) Implement a solution to handle dropped packages, ticket
>>>>>>>>>>>> #1960
>>>>>>>>>>> [AVM]  This is nothing but a flow control implementation in
>>>>>>>>>>> MDS; this is a future enhancement.
>>>>>>>>>>>
>>>>>>>>>>>> 2) Investigate why packages may be dropped, the receiving MDS
>>>>>>>>>>>> thread is a real time thread and should be able to consume a
>>>>>>>>>>>> large amount of incoming messages.
>>>>>>>>>>>> E.g. is the receiving MDS thread "live hanging" due to locks,
>>>>>>>>>>>> file I/O etc?
>>>>>>>>>>>>> This was the reason we haven't gone for it while addressing
>>>>>>>>>>>>> Ticket #1227
>>>>>>>>>>>>> (https://sourceforge.net/p/opensaf/mailman/message/33207717/).
>>>>>>>>>>>>> So currently we don't have any advantage in disabling
>>>>>>>>>>>>> TIPC_DEST_DROPPABLE and not allowing multicast messages.
>>>>>>>>>>>>>
>>>>>>>>>>>>> -AVM
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> On 8/18/2016 2:43 PM, Hans Nordeback wrote:
>>>>>>>>>>>>>> osaf/libs/core/mds/mds_dt_tipc.c |  32 +++++++++++++++++++++++++-------
>>>>>>>>>>>>>>  1 files changed, 25 insertions(+), 7 deletions(-)
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> diff --git a/osaf/libs/core/mds/mds_dt_tipc.c b/osaf/libs/core/mds/mds_dt_tipc.c
>>>>>>>>>>>>>> --- a/osaf/libs/core/mds/mds_dt_tipc.c
>>>>>>>>>>>>>> +++ b/osaf/libs/core/mds/mds_dt_tipc.c
>>>>>>>>>>>>>> @@ -320,6 +320,15 @@ uint32_t mdtm_tipc_init(NODE_ID nodeid,
>>>>>>>>>>>>>>                  m_MDS_LOG_INFO("MDTM: Successfully set default socket option TIPC_IMP = %d", TIPCIMPORTANCE);
>>>>>>>>>>>>>>          }
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> +        int droppable = 0;
>>>>>>>>>>>>>> +        if (setsockopt(tipc_cb.BSRsock, SOL_TIPC, TIPC_DEST_DROPPABLE, &droppable, sizeof(droppable)) != 0) {
>>>>>>>>>>>>>> +                LOG_ER("MDTM: Can't set TIPC_DEST_DROPPABLE to zero err :%s\n", strerror(errno));
>>>>>>>>>>>>>> +                m_MDS_LOG_ERR("MDTM: Can't set TIPC_DEST_DROPPABLE to zero err :%s\n", strerror(errno));
>>>>>>>>>>>>>> +                osafassert(0);
>>>>>>>>>>>>>> +        } else {
>>>>>>>>>>>>>> +                m_MDS_LOG_NOTIFY("MDTM: Successfully set TIPC_DEST_DROPPABLE to zero");
>>>>>>>>>>>>>> +        }
>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>          return NCSCC_RC_SUCCESS;
>>>>>>>>>>>>>>  }
>>>>>>>>>>>>>> @@ -563,6 +572,8 @@ ssize_t recvfrom_connectionless (int sd,
>>>>>>>>>>>>>>      unsigned char *cptr;
>>>>>>>>>>>>>>      int i;
>>>>>>>>>>>>>>      int has_addr;
>>>>>>>>>>>>>> +    int anc_data[2];
>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>      ssize_t sz;
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>      has_addr = (from != NULL) && (addrlen != NULL);
>>>>>>>>>>>>>> @@ -591,19 +602,26 @@ ssize_t recvfrom_connectionless (int sd,
>>>>>>>>>>>>>>                 if the message was sent using a TIPC name or name sequence as the
>>>>>>>>>>>>>>                 destination rather than a TIPC port ID So abort for TIPC_ERRINFO and TIPC_RETDATA*/
>>>>>>>>>>>>>>              if (anc->cmsg_type == TIPC_ERRINFO) {
>>>>>>>>>>>>>> -                /* TIPC_ERRINFO - TIPC error code associated with a returned data message or a connection termination message so abort */
>>>>>>>>>>>>>> -                m_MDS_LOG_CRITICAL("MDTM: undelivered message condition ancillary data: TIPC_ERRINFO abort err :%s", strerror(errno) );
>>>>>>>>>>>>>> -                abort();
>>>>>>>>>>>>>> +                anc_data[0] = *((unsigned int*)(CMSG_DATA(anc) + 0));
>>>>>>>>>>>>>> +                if (anc_data[0] == TIPC_ERR_OVERLOAD) {
>>>>>>>>>>>>>> +                    LOG_CR("MDTM: undelivered message condition ancillary data: TIPC_ERR_OVERLOAD");
>>>>>>>>>>>>>> +                    m_MDS_LOG_CRITICAL("MDTM: undelivered message condition ancillary data: TIPC_ERR_OVERLOAD");
>>>>>>>>>>>>>> +                } else {
>>>>>>>>>>>>>> +                    /* TIPC_ERRINFO - TIPC error code associated with a returned data message or a connection termination message so abort */
>>>>>>>>>>>>>> +                    LOG_CR("MDTM: undelivered message condition ancillary data: TIPC_ERRINFO abort err : %d", anc_data[0]);
>>>>>>>>>>>>>> +                    m_MDS_LOG_CRITICAL("MDTM: undelivered message condition ancillary data: TIPC_ERRINFO abort err : %d", anc_data[0]);
>>>>>>>>>>>>>> +                }
>>>>>>>>>>>>>>              } else if (anc->cmsg_type == TIPC_RETDATA) {
>>>>>>>>>>>>>> -                /* If we set TIPC_DEST_DROPPABLE off messge (configure TIPC to return rejected messages to the sender )
>>>>>>>>>>>>>> +                /* If we set TIPC_DEST_DROPPABLE off message (configure TIPC to return rejected messages to the sender )
>>>>>>>>>>>>>>                     we will hit this when we implement MDS retransmit lost messages  abort can be replaced with flow control logic*/
>>>>>>>>>>>>>>                  for (i = anc->cmsg_len - sizeof(*anc); i > 0; i--) {
>>>>>>>>>>>>>> -                    m_MDS_LOG_DBG("MDTM: returned byte 0x%02x\n", *cptr);
>>>>>>>>>>>>>> +                    LOG_CR("MDTM: returned byte 0x%02x\n", *cptr);
>>>>>>>>>>>>>> +                    m_MDS_LOG_CRITICAL("MDTM: returned byte 0x%02x\n", *cptr);
>>>>>>>>>>>>>>                      cptr++;
>>>>>>>>>>>>>>                  }
>>>>>>>>>>>>>>                  /* TIPC_RETDATA -The contents of a returned data message  so abort */
>>>>>>>>>>>>>> -                m_MDS_LOG_CRITICAL("MDTM: undelivered message condition ancillary data: TIPC_RETDATA abort err :%s", strerror(errno) );
>>>>>>>>>>>>>> -                abort();
>>>>>>>>>>>>>> +                LOG_CR("MDTM: undelivered message condition ancillary data: TIPC_RETDATA");
>>>>>>>>>>>>>> +                m_MDS_LOG_CRITICAL("MDTM: undelivered message condition ancillary data: TIPC_RETDATA");
>>>>>>>>>>>>>>              } else if (anc->cmsg_type == TIPC_DESTNAME) {
>>>>>>>>>>>>>>                  if (sz == 0) {
>>>>>>>>>>>>>>                      m_MDS_LOG_DBG("MDTM: recd bytes=0 on received on sock, abnormal/unknown condition. Ignoring");
>
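The quoted patch reads the first word of the TIPC_ERRINFO ancillary data and logs TIPC_ERR_OVERLOAD separately from other error codes. A minimal sketch of that classification step follows, for illustration only. The names classify_errinfo, demo_classify, drop_reason and the SKETCH_* constants are hypothetical and not part of MDS. The constants are assumed to mirror the values in <linux/tipc.h>, which real code should include instead. The sketch copies the payload with memcpy rather than the pointer cast used in the patch, and it feeds the parser a hand-built control buffer in place of a recvmsg() call on a TIPC socket.

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>

/* Assumption: values mirrored from <linux/tipc.h> so the sketch builds
 * without the TIPC headers; real code should include that header. */
#define SKETCH_SOL_TIPC          271
#define SKETCH_TIPC_ERRINFO      1
#define SKETCH_TIPC_ERR_OVERLOAD 4

enum drop_reason { DROP_NONE, DROP_OVERLOAD, DROP_OTHER };

/* Walk the control messages of a received msghdr and classify the first
 * TIPC_ERRINFO entry. Its payload is two 32-bit words; the first word is
 * the TIPC error code. */
static enum drop_reason classify_errinfo(struct msghdr *msg,
                                         unsigned int *err_code)
{
    struct cmsghdr *anc;

    for (anc = CMSG_FIRSTHDR(msg); anc != NULL; anc = CMSG_NXTHDR(msg, anc)) {
        unsigned int info[2] = {0, 0};
        size_t len;

        if (anc->cmsg_level != SKETCH_SOL_TIPC ||
            anc->cmsg_type != SKETCH_TIPC_ERRINFO)
            continue;
        if (anc->cmsg_len < CMSG_LEN(sizeof(info[0])))
            continue; /* payload too short to hold an error code */

        len = anc->cmsg_len - CMSG_LEN(0);
        if (len > sizeof(info))
            len = sizeof(info);
        memcpy(info, CMSG_DATA(anc), len); /* no alignment assumption */

        if (err_code != NULL)
            *err_code = info[0];
        return info[0] == SKETCH_TIPC_ERR_OVERLOAD ? DROP_OVERLOAD
                                                   : DROP_OTHER;
    }
    return DROP_NONE;
}

/* Build a synthetic control buffer holding one TIPC_ERRINFO entry and
 * classify it. */
static enum drop_reason demo_classify(unsigned int tipc_err)
{
    union {
        char buf[CMSG_SPACE(2 * sizeof(unsigned int))];
        struct cmsghdr align; /* forces cmsghdr alignment of buf */
    } ctl;
    unsigned int info[2] = { tipc_err, 0 };
    struct msghdr msg;
    struct cmsghdr *anc;

    memset(&ctl, 0, sizeof(ctl));
    memset(&msg, 0, sizeof(msg));
    msg.msg_control = ctl.buf;
    msg.msg_controllen = sizeof(ctl.buf);

    anc = CMSG_FIRSTHDR(&msg);
    assert(anc != NULL);
    anc->cmsg_level = SKETCH_SOL_TIPC;
    anc->cmsg_type = SKETCH_TIPC_ERRINFO;
    anc->cmsg_len = CMSG_LEN(sizeof(info));
    memcpy(CMSG_DATA(anc), info, sizeof(info));

    return classify_errinfo(&msg, NULL);
}
```

Returning a reason code instead of calling abort() leaves the log-or-abort decision to the caller, which is the point the thread is debating.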
> ------------------------------------------------------------------------------
> _______________________________________________
> Opensaf-devel mailing list
> Opensaf-devel@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/opensaf-devel
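The thread suggests that an application may increase SO_SNDBUF/SO_RCVBUF once TIPC_ERR_OVERLOAD has been identified as the cause of dropped messages. A minimal sketch of requesting a larger receive buffer and reading back what the kernel granted is shown here, for illustration only. The helper names request_rcvbuf and demo_rcvbuf are hypothetical, and the demo uses an AF_UNIX datagram socket as a stand-in so that it runs without a TIPC bearer; whether a given size is sufficient for an MDS socket is not something this sketch can tell.

```c
#include <assert.h>
#include <sys/socket.h>
#include <unistd.h>

/* Ask for a receive buffer of `bytes` on socket `sd` and return the size
 * the kernel reports afterwards, or -1 on failure. */
static int request_rcvbuf(int sd, int bytes)
{
    int granted = 0;
    socklen_t len = sizeof(granted);

    assert(bytes > 0);
    if (setsockopt(sd, SOL_SOCKET, SO_RCVBUF, &bytes, sizeof(bytes)) != 0)
        return -1;
    if (getsockopt(sd, SOL_SOCKET, SO_RCVBUF, &granted, &len) != 0)
        return -1;
    return granted;
}

/* Open a throwaway datagram socket (stand-in for the MDS TIPC socket),
 * request the buffer size and report what was granted. */
static int demo_rcvbuf(int bytes)
{
    int sd = socket(AF_UNIX, SOCK_DGRAM, 0);
    int granted;

    if (sd < 0)
        return -1;
    granted = request_rcvbuf(sd, bytes);
    close(sd);
    return granted;
}
```

On Linux the value read back is normally double the requested size and is capped by net.core.rmem_max, so comparing the granted value with the request shows whether the system-wide limit also needs raising.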


