Hi Mahesh,

Please see my comments below.

/Thanks HansN


On 08/23/2016 07:21 AM, A V Mahesh wrote:
> Hi HansN,
>
> Let us first discuss the error handling and abort; then we can come
> back to whether TIPC currently does or does not permit an application
> to send a multicast message with the "destination droppable" setting
> disabled.
>
> Let us disable TIPC_DEST_DROPPABLE, so that TIPC will try to return
> an undelivered multicast message to its sender and we can determine
> whether the issue is caused by TIPC_ERR_OVERLOAD. This helps in
> debugging, since the application may then increase SO_SNDBUF/SO_RCVBUF
> to reduce the problem.
>
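The SO_RCVBUF suggestion above could look roughly like the sketch below. It is only an illustration: grow_rcvbuf() is a hypothetical helper, not an MDS function, and it works on any socket descriptor rather than on the MDS TIPC socket specifically. It requests a larger receive buffer and reads back what the kernel actually granted, since the kernel may adjust or cap the requested value.

```c
#include <assert.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper (not part of MDS): ask for a receive buffer of
 * `bytes` on socket `sd` and return the size the kernel reports
 * afterwards, or -1 if either socket call fails. */
static int grow_rcvbuf(int sd, int bytes)
{
    int granted = 0;
    socklen_t len = sizeof(granted);

    assert(bytes > 0);
    if (setsockopt(sd, SOL_SOCKET, SO_RCVBUF, &bytes, sizeof(bytes)) != 0)
        return -1;
    /* Read the value back: the kernel may have changed the request. */
    if (getsockopt(sd, SOL_SOCKET, SO_RCVBUF, &granted, &len) != 0)
        return -1;
    return granted;
}
```

A caller would compare the returned size with the requested one and log the difference, so that a silently capped buffer shows up during debugging.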
> But we still need to abort(). The reason is that the current MDS
> implementation doesn't have flow control logic (no retry on error),
> so an application like AMF can go wrong and the cluster will go into
> an unstable/unrecoverable state.
>
[HansN] In the current implementation messages are dropped silently and
no abort is done. This patch enables logging when messages are dropped,
to help in debugging. I don't agree that we should also introduce abort;
instead we should:
1) Implement a solution to handle dropped messages, ticket #1960.
2) Investigate why messages may be dropped. The receiving MDS thread is
a real-time thread and should be able to consume a large number of
incoming messages.
E.g. is the receiving MDS thread "live hanging" due to locks, file I/O, etc.?
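
To make the "log instead of abort" idea concrete, here is a minimal standalone sketch of how the ancillary data of a returned message can be classified. It is not the patch itself: classify_returned(), fake_errinfo(), enum returned_kind and union errinfo_buf are hypothetical names used only for illustration, and the sketch assumes the TIPC_ERRINFO item starts with the 32-bit TIPC error code. fake_errinfo() only imitates what recvmsg() would deliver, so the sketch can be exercised without a TIPC socket.

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <linux/tipc.h>   /* SOL_TIPC, TIPC_ERRINFO, TIPC_ERR_OVERLOAD */

/* Hypothetical classification, not an MDS type. */
enum returned_kind { NOT_RETURNED, RETURNED_OVERLOAD, RETURNED_OTHER };

/* Aligned control buffer with room for one TIPC_ERRINFO item
 * (assumed layout: error code followed by returned data length). */
union errinfo_buf {
    char raw[CMSG_SPACE(2 * sizeof(unsigned int))];
    struct cmsghdr align;
};

/* Walk the ancillary data of a received message and report whether
 * TIPC returned it to the sender and, if so, with which error code. */
static enum returned_kind classify_returned(struct msghdr *msg,
                                            unsigned int *err)
{
    struct cmsghdr *anc;

    for (anc = CMSG_FIRSTHDR(msg); anc != NULL; anc = CMSG_NXTHDR(msg, anc)) {
        if (anc->cmsg_level != SOL_TIPC || anc->cmsg_type != TIPC_ERRINFO)
            continue;
        if (anc->cmsg_len < CMSG_LEN(sizeof(*err)))
            continue;   /* too short to hold an error code */
        /* memcpy avoids an unaligned read through a cast pointer. */
        memcpy(err, CMSG_DATA(anc), sizeof(*err));
        return (*err == TIPC_ERR_OVERLOAD) ? RETURNED_OVERLOAD
                                           : RETURNED_OTHER;
    }
    return NOT_RETURNED;
}

/* Demo helper: build the control data that recvmsg() would hand back
 * for a message returned with error code `err`. */
static void fake_errinfo(struct msghdr *msg, union errinfo_buf *buf,
                         unsigned int err)
{
    unsigned int info[2] = { err, 0 };
    struct cmsghdr *anc;

    memset(msg, 0, sizeof(*msg));
    memset(buf, 0, sizeof(*buf));
    msg->msg_control = buf->raw;
    msg->msg_controllen = sizeof(buf->raw);
    anc = CMSG_FIRSTHDR(msg);
    assert(anc != NULL);
    anc->cmsg_level = SOL_TIPC;
    anc->cmsg_type = TIPC_ERRINFO;
    anc->cmsg_len = CMSG_LEN(sizeof(info));
    memcpy(CMSG_DATA(anc), info, sizeof(info));
}
```

In the receive path the caller would log RETURNED_OVERLOAD separately from the other error codes and carry on, leaving the actual recovery to the retransmit work tracked in ticket #1960.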
> This was the reason we didn't go for it while addressing ticket
> #1227 (https://sourceforge.net/p/opensaf/mailman/message/33207717/).
> So currently we gain nothing by disabling TIPC_DEST_DROPPABLE and
> not allowing multicast messages.
>
> -AVM
>
>
> On 8/18/2016 2:43 PM, Hans Nordeback wrote:
>>   osaf/libs/core/mds/mds_dt_tipc.c |  32 
>> +++++++++++++++++++++++++-------
>>   1 files changed, 25 insertions(+), 7 deletions(-)
>>
>>
>> diff --git a/osaf/libs/core/mds/mds_dt_tipc.c 
>> b/osaf/libs/core/mds/mds_dt_tipc.c
>> --- a/osaf/libs/core/mds/mds_dt_tipc.c
>> +++ b/osaf/libs/core/mds/mds_dt_tipc.c
>> @@ -320,6 +320,15 @@ uint32_t mdtm_tipc_init(NODE_ID nodeid,
>>                   m_MDS_LOG_INFO("MDTM: Successfully set default 
>> socket option TIPC_IMP = %d", TIPCIMPORTANCE);
>>           }
>>   +        int droppable = 0;
>> +        if (setsockopt(tipc_cb.BSRsock, SOL_TIPC, 
>> TIPC_DEST_DROPPABLE, &droppable, sizeof(droppable)) != 0) {
>> +                LOG_ER("MDTM: Can't set TIPC_DEST_DROPPABLE to zero 
>> err :%s\n", strerror(errno));
>> +                m_MDS_LOG_ERR("MDTM: Can't set TIPC_DEST_DROPPABLE 
>> to zero err :%s\n", strerror(errno));
>> +                osafassert(0);
>> +        } else {
>> +                m_MDS_LOG_NOTIFY("MDTM: Successfully set 
>> TIPC_DEST_DROPPABLE to zero");
>> +        }
>> +
>>       return NCSCC_RC_SUCCESS;
>>   }
>>   @@ -563,6 +572,8 @@ ssize_t recvfrom_connectionless (int sd,
>>       unsigned char *cptr;
>>       int i;
>>       int has_addr;
>> +    int anc_data[2];
>> +
>>       ssize_t sz;
>>         has_addr = (from != NULL) && (addrlen != NULL);
>> @@ -591,19 +602,26 @@ ssize_t recvfrom_connectionless (int sd,
>>                  if the message was sent using a TIPC name or name 
>> sequence as the
>>                  destination rather than a TIPC port ID So abort for 
>> TIPC_ERRINFO and TIPC_RETDATA*/
>>               if (anc->cmsg_type == TIPC_ERRINFO) {
>> -                /* TIPC_ERRINFO - TIPC error code associated with a 
>> returned data message or a connection termination message  so abort */
>> -                m_MDS_LOG_CRITICAL("MDTM: undelivered message 
>> condition ancillary data: TIPC_ERRINFO abort err :%s", 
>> strerror(errno) );
>> -                abort();
>> +                anc_data[0] = *((unsigned int*)(CMSG_DATA(anc) + 0));
>> +                if (anc_data[0] == TIPC_ERR_OVERLOAD) {
>> +                    LOG_CR("MDTM: undelivered message condition 
>> ancillary data: TIPC_ERR_OVERLOAD");
>> +                    m_MDS_LOG_CRITICAL("MDTM: undelivered message 
>> condition ancillary data: TIPC_ERR_OVERLOAD");
>> +                } else {
>> +                    /* TIPC_ERRINFO - TIPC error code associated 
>> with a returned data message or a connection termination message  so 
>> abort */
>> +                    LOG_CR("MDTM: undelivered message condition 
>> ancillary data: TIPC_ERRINFO abort err : %d", anc_data[0]);
>> +                    m_MDS_LOG_CRITICAL("MDTM: undelivered message 
>> condition ancillary data: TIPC_ERRINFO abort err : %d", anc_data[0]);
>> +                }
>>               } else if (anc->cmsg_type == TIPC_RETDATA) {
>> -                /* If we set TIPC_DEST_DROPPABLE off messge 
>> (configure TIPC to return rejected messages to the sender )
>> +                /* If we set TIPC_DEST_DROPPABLE off message 
>> (configure TIPC to return rejected messages to the sender )
>>                      we will hit this when we implement MDS 
>> retransmit lost messages  abort can be replaced with flow control 
>> logic*/
>>                   for (i = anc->cmsg_len - sizeof(*anc); i > 0; i--) {
>> -                    m_MDS_LOG_DBG("MDTM: returned byte 0x%02x\n", 
>> *cptr);
>> +                    LOG_CR("MDTM: returned byte 0x%02x\n", *cptr);
>> +                    m_MDS_LOG_CRITICAL("MDTM: returned byte 
>> 0x%02x\n", *cptr);
>>                       cptr++;
>>                   }
>>                   /* TIPC_RETDATA -The contents of a returned data 
>> message  so abort */
>> -                m_MDS_LOG_CRITICAL("MDTM: undelivered message 
>> condition ancillary data: TIPC_RETDATA abort err :%s", 
>> strerror(errno) );
>> -                abort();
>> +                LOG_CR("MDTM: undelivered message condition 
>> ancillary data: TIPC_RETDATA");
>> +                m_MDS_LOG_CRITICAL("MDTM: undelivered message 
>> condition ancillary data: TIPC_RETDATA");
>>               } else if (anc->cmsg_type == TIPC_DESTNAME) {
>>                   if (sz == 0) {
>>                       m_MDS_LOG_DBG("MDTM: recd bytes=0 on received 
>> on sock, abnormal/unknown  condition. Ignoring");
>


