Hi HansN,
  

On 8/23/2016 5:22 PM, Hans Nordebäck wrote:

> Hi Mahesh,
>
> Yes, this is my understanding too, if TIPC_DROPPABLE = true tipc may drop 
> messages silently,  at receive sock buffer full condition,  but do not return 
> any ancillary message.
> If TIPC_DROPPABLE = false tipc may drop message but will send an ancillary 
> message to inform about TIPC_ERR_OVERLOAD.
[AVM]

My observation are understanding is different, based on TIPC code and  
Linux TIPC 2.0 Programmer's Guide ,
that the TIPC_ERR_OVERLOAD error returned when TIPC is unable to enqueue 
an incoming message on the receiving socket's receive queue
irrelevant of TIPC_DEST_DROPPABLE enabled or disabled.

The only difference between TIPC_DEST_DROPPABLE enabled or disabled is , 
If  TIPC_DEST_DROPPABLE enabled, the message is discarded and recvmsg() 
returned size is ZERO and application will get errors,
if TIPC_DEST_DROPPABLE disabled  the message is returned to the sender 
it means the recvmsg() returned size is user send data size and 
application will get errors .

I did check the TIPC code and documentations  and I haven't  get any 
evidences that  TIPC_ERR_OVERLOAD error code will be send only
If TIPC_DEST_DROPPABLE = false.

Even while testing #1227 
(https://sourceforge.net/p/opensaf/mailman/message/33207717/) my 
observations and understanding was,
an individual TIPC socket is only allowed to queue up 
OVERLOAD_LIMIT_BASE/2 messages of the lowest importance level before it 
starts rejecting them.
Once a socket receiving queue length exceeds the maximum limit value, 
the receiving socket will send out a reject message  with 
TIPC_ERR_OVERLOAD error code
with cmsg_type as TIPC_ERRINFO/TIPC_RETDATA, and the tipc code  and 
Linux TIPC 2.0 Programmer's Guide  confirmed the same .

tipc/socket.c
=======================================================
/* Reject message if there isn't room to queue it */

recv_q_len = (u32)atomic_read(&tipc_queue_size);
if (unlikely(recv_q_len >= OVERLOAD_LIMIT_BASE)) {
     if (rx_queue_full(msg, recv_q_len, OVERLOAD_LIMIT_BASE))
         return TIPC_ERR_OVERLOAD;
}
recv_q_len = skb_queue_len(&sk->sk_receive_queue);
if (unlikely(recv_q_len >= (OVERLOAD_LIMIT_BASE / 2))) {
     if (rx_queue_full(msg, recv_q_len, OVERLOAD_LIMIT_BASE / 2))
         return TIPC_ERR_OVERLOAD;
}
=======================================================


2.1.17. setsockopt() of  TIPC 2.0 Programmer's Guide
=======================================================
TIPC_DEST_DROPPABLE
This option governs the handling of messages sent by the socket if the 
message cannot be delivered to its destination,
either because the receiver is congested or because the specified 
receiver does not exist. If enabled, the message is discarded; otherwise 
the message is returned to the sender.

By default, this option is disabled for SOCK_SEQPACKET and SOCK_STREAM 
socket types, and enabled for SOCK_RDM and SOCK_DGRAM,
This arrangement ensures proper teardown of failed connections when 
connection-oriented data transfer is used, without increasing the 
complexity of connectionless data transfer.

TIPC_SRC_DROPPABLE
This option governs the handling of messages sent by the socket if link 
congestion occurs. If enabled, the message is discarded; otherwise the 
system queues the message for later transmission.
By default, this option is disabled for SOCK_SEQPACKET, SOCK_STREAM, and 
SOCK_RDM socket types (resulting in "reliable" data transfer), and 
enabled for SOCK_DGRAM (resulting in "unreliable" data transfer).
=======================================================

Now I will try to create OVERLOAD case and update you soon my latest 
observations.

-AVM

> Correcting this and adding an abort is not backward compatible as some 
> service already handle flow control in some way, only log when packages are 
> dropped.
> Regarding ticket #1960 there are other solutions than introducing flow 
> control in MDS, e.g. expose an option to the service to choose connection 
> oriented
> or connection less.
> The problem with dropped messages seems in one case related to, (by MDS), 
> intensive MDS logging.
>
> /Thanks HansN
> -----Original Message-----
> From: A V Mahesh [mailto:mahesh.va...@oracle.com]
> Sent: den 23 augusti 2016 11:27
> To: Hans Nordebäck <hans.nordeb...@ericsson.com>; Anders Widell 
> <anders.wid...@ericsson.com>; mathi.naic...@oracle.com
> Cc: opensaf-devel@lists.sourceforge.net
> Subject: Re: [PATCH 1 of 1] MDS: Log TIPC dropped messages [#1957]
>
> Hi HansN,
>
> It seems I am missing some thing , please allow me to under stand
>
> If I currently understand you observation :
>
> With current Opensaf code ( this #1957 patch NOT applied ) , by default 
> TIPC_DROPPABLE=true ,while running Opensaf with that binary when 
> TIPC_ERR_OVERLOAD  occurring, TIPC is not  given errors TIPC_ERRINFO or  
> TIPC_RETDATA and following code is not being get hit of function 
> recvfrom_connectionless(), is my  understanding right ?
>
> =============================================================================================================
>
> *if (anc->cmsg_type == TIPC_ERRINFO) {*
>       /* TIPC_ERRINFO - TIPC error code associated with a returned data 
> message or a connection termination message  so abort */
>       m_MDS_LOG_CRITICAL("MDTM: undelivered message condition ancillary
> data: TIPC_ERRINFO abort err :%s", strerror(errno) );
> *abort();*
> *} else if (anc->cmsg_type == TIPC_RETDATA) {*
>       /* If we set TIPC_DEST_DROPPABLE off messge (configure TIPC to return 
> rejected messages to the sender )
>          we will hit this when we implement MDS retransmit lost messages 
> abort can be replaced with flow control logic*/
>       for (i = anc->cmsg_len - sizeof(*anc); i > 0; i--) {
>           m_MDS_LOG_DBG("MDTM: returned byte 0x%02x\n", *cptr);
>           cptr++;
>       }
>       /* TIPC_RETDATA -The contents of a returned data message  so abort */
>       m_MDS_LOG_CRITICAL("MDTM: undelivered message condition ancillary
> data: TIPC_RETDATA abort err :%s", strerror(errno) );
> *abort();*
> }
>
> =============================================================================================================
>
> -AVM
>
>
> On 8/23/2016 1:08 PM, Hans Nordebäck wrote:
>> Hi Mahesh,
>>
>> Please see response below with [HansN] /Thanks HansN
>>
>> -----Original Message-----
>> From: A V Mahesh [mailto:mahesh.va...@oracle.com]
>> Sent: den 23 augusti 2016 08:25
>> To: Hans Nordebäck <hans.nordeb...@ericsson.com>; Anders Widell
>> <anders.wid...@ericsson.com>; mathi.naic...@oracle.com
>> Cc: opensaf-devel@lists.sourceforge.net
>> Subject: Re: [PATCH 1 of 1] MDS: Log TIPC dropped messages [#1957]
>>
>> Hi HansN
>>
>> Please see response below with [AVM]
>>
>> -AVM
>>
>> On 8/23/2016 11:41 AM, Hans Nordebäck wrote:
>>> Hi Mahesh,
>>>
>>> please see comments below.
>>>
>>> /Thanks HansN
>>>
>>>
>>> On 08/23/2016 07:21 AM, A V Mahesh wrote:
>>>> Hi HansN,
>>>>
>>>> Let us fist discuss the error handling and abort, then we can come
>>>> back to interpretation of  TIPC currently  does permit  OR does not
>>>> permit an application to send a multicast message with the
>>>> "destination droppable" setting disabled.
>>>>
>>>> Let us disable TIPC_DEST_DROPPABLE, so that  TIPC will try to return
>>>> an undelivered multicast message to its sender and we can  determine
>>>> issue is  because of TIPC_ERR_OVERLOAD, this helps in debugging , so
>>>> that application may increased SO_SNDBUF/SO_RCVBUF to reduce the
>>>> problem.
>>>>
>>>> But still we need to abort(), the reason for that is current MDS
>>>> implementations doesn't have flow control logic ( no retry because
>>>> of error ) , so Application like AMF can go wrong and cluster will
>>>> go into unstable/recoverble state.
>>>>
>>> [HansN] In the current implementation messages are dropped silently
>>> and no abort is done.
>> [AVM]  I can see  abort(); in current code , you mean abort(); is not 
>> working and application(amf) is not existing ?
>> [HansN] In case of TIPC_DROPPABLE=true and messages are dropped,
>> (TIPC_ERR_OVERLOAD)  no abort is be performed, e.g amfd detects this in the 
>> msg sanity chk and logs "invalid msg id ..."
>> ======================================================================
>> ======
>> if (anc->cmsg_type == TIPC_ERRINFO) {
>>        /* TIPC_ERRINFO - TIPC error code associated with a returned data 
>> message or a connection termination message  so abort */
>>        m_MDS_LOG_CRITICAL("MDTM: undelivered message condition
>> ancillary
>> data: TIPC_ERRINFO abort err :%s", strerror(errno) );
>> *abort();*
>> } else if (anc->cmsg_type == TIPC_RETDATA) {
>>        /* If we set TIPC_DEST_DROPPABLE off messge (configure TIPC to return 
>> rejected messages to the sender )
>>           we will hit this when we implement MDS retransmit lost messages 
>> abort can be replaced with flow control logic*/
>>        for (i = anc->cmsg_len - sizeof(*anc); i > 0; i--) {
>>            m_MDS_LOG_DBG("MDTM: returned byte 0x%02x\n", *cptr);
>>            cptr++;
>>        }
>>        /* TIPC_RETDATA -The contents of a returned data message  so abort */
>>        m_MDS_LOG_CRITICAL("MDTM: undelivered message condition
>> ancillary
>> data: TIPC_RETDATA abort err :%s", strerror(errno) );
>> *abort();*
>> }
>> ======================================================================
>> ======
>>> This patch enables logging
>>> when packages are dropped to help in debugging. I don't agree that we
>>> should also introduce abort, but instead:
>>> 1) Implement a solution to handle dropped packages, ticket #1960
>> [AVM]  This is nothing but flow control implementation in MDS, this is
>> future enhancement
>>
>>> 2) Investigate why packages may be dropped, the receiving MDS thread
>>> is a real time thread and should be able to consume a large amount of
>>> incoming messages.
>>> E.g. is the receiving MDS thread "live hanging" due to locks, file
>>> I/O etc?
>>>> This was the reason we haven't gone for it while addressing Ticket
>>>> #1227 (https://sourceforge.net/p/opensaf/mailman/message/33207717/)
>>>> So currently we don't have any advantage of disabling
>>>> TIPC_DEST_DROPPABLE and not allowing multicast  messages.
>>>>
>>>> -AVM
>>>>
>>>>
>>>> On 8/18/2016 2:43 PM, Hans Nordeback wrote:
>>>>>     osaf/libs/core/mds/mds_dt_tipc.c |  32
>>>>> +++++++++++++++++++++++++-------
>>>>>     1 files changed, 25 insertions(+), 7 deletions(-)
>>>>>
>>>>>
>>>>> diff --git a/osaf/libs/core/mds/mds_dt_tipc.c
>>>>> b/osaf/libs/core/mds/mds_dt_tipc.c
>>>>> --- a/osaf/libs/core/mds/mds_dt_tipc.c
>>>>> +++ b/osaf/libs/core/mds/mds_dt_tipc.c
>>>>> @@ -320,6 +320,15 @@ uint32_t mdtm_tipc_init(NODE_ID nodeid,
>>>>>                     m_MDS_LOG_INFO("MDTM: Successfully set default
>>>>> socket option TIPC_IMP = %d", TIPCIMPORTANCE);
>>>>>             }
>>>>>     +        int droppable = 0;
>>>>> +        if (setsockopt(tipc_cb.BSRsock, SOL_TIPC,
>>>>> TIPC_DEST_DROPPABLE, &droppable, sizeof(droppable)) != 0) {
>>>>> +                LOG_ER("MDTM: Can't set TIPC_DEST_DROPPABLE to
>>>>> + zero
>>>>> err :%s\n", strerror(errno));
>>>>> +                m_MDS_LOG_ERR("MDTM: Can't set TIPC_DEST_DROPPABLE
>>>>> to zero err :%s\n", strerror(errno));
>>>>> +                osafassert(0);
>>>>> +        } else {
>>>>> +                m_MDS_LOG_NOTIFY("MDTM: Successfully set
>>>>> TIPC_DEST_DROPPABLE to zero");
>>>>> +        }
>>>>> +
>>>>>         return NCSCC_RC_SUCCESS;
>>>>>     }
>>>>>     @@ -563,6 +572,8 @@ ssize_t recvfrom_connectionless (int sd,
>>>>>         unsigned char *cptr;
>>>>>         int i;
>>>>>         int has_addr;
>>>>> +    int anc_data[2];
>>>>> +
>>>>>         ssize_t sz;
>>>>>           has_addr = (from != NULL) && (addrlen != NULL); @@ -591,19
>>>>> +602,26 @@ ssize_t recvfrom_connectionless (int sd,
>>>>>                    if the message was sent using a TIPC name or name
>>>>> sequence as the
>>>>>                    destination rather than a TIPC port ID So abort
>>>>> for TIPC_ERRINFO and TIPC_RETDATA*/
>>>>>                 if (anc->cmsg_type == TIPC_ERRINFO) {
>>>>> -                /* TIPC_ERRINFO - TIPC error code associated with a
>>>>> returned data message or a connection termination message  so abort */
>>>>> -                m_MDS_LOG_CRITICAL("MDTM: undelivered message
>>>>> condition ancillary data: TIPC_ERRINFO abort err :%s",
>>>>> strerror(errno) );
>>>>> -                abort();
>>>>> +                anc_data[0] = *((unsigned int*)(CMSG_DATA(anc) + 0));
>>>>> +                if (anc_data[0] == TIPC_ERR_OVERLOAD) {
>>>>> +                    LOG_CR("MDTM: undelivered message condition
>>>>> ancillary data: TIPC_ERR_OVERLOAD");
>>>>> +                    m_MDS_LOG_CRITICAL("MDTM: undelivered message
>>>>> condition ancillary data: TIPC_ERR_OVERLOAD");
>>>>> +                } else {
>>>>> +                    /* TIPC_ERRINFO - TIPC error code associated
>>>>> with a returned data message or a connection termination message
>>>>> so abort */
>>>>> +                    LOG_CR("MDTM: undelivered message condition
>>>>> ancillary data: TIPC_ERRINFO abort err : %d", anc_data[0]);
>>>>> +                    m_MDS_LOG_CRITICAL("MDTM: undelivered message
>>>>> condition ancillary data: TIPC_ERRINFO abort err : %d",
>>>>> anc_data[0]);
>>>>> +                }
>>>>>                 } else if (anc->cmsg_type == TIPC_RETDATA) {
>>>>> -                /* If we set TIPC_DEST_DROPPABLE off messge
>>>>> (configure TIPC to return rejected messages to the sender )
>>>>> +                /* If we set TIPC_DEST_DROPPABLE off message
>>>>> (configure TIPC to return rejected messages to the sender )
>>>>>                        we will hit this when we implement MDS
>>>>> retransmit lost messages  abort can be replaced with flow control
>>>>> logic*/
>>>>>                     for (i = anc->cmsg_len - sizeof(*anc); i > 0; i--) {
>>>>> -                    m_MDS_LOG_DBG("MDTM: returned byte 0x%02x\n",
>>>>> *cptr);
>>>>> +                    LOG_CR("MDTM: returned byte 0x%02x\n", *cptr);
>>>>> +                    m_MDS_LOG_CRITICAL("MDTM: returned byte
>>>>> 0x%02x\n", *cptr);
>>>>>                         cptr++;
>>>>>                     }
>>>>>                     /* TIPC_RETDATA -The contents of a returned data
>>>>> message  so abort */
>>>>> -                m_MDS_LOG_CRITICAL("MDTM: undelivered message
>>>>> condition ancillary data: TIPC_RETDATA abort err :%s",
>>>>> strerror(errno) );
>>>>> -                abort();
>>>>> +                LOG_CR("MDTM: undelivered message condition
>>>>> ancillary data: TIPC_RETDATA");
>>>>> +                m_MDS_LOG_CRITICAL("MDTM: undelivered message
>>>>> condition ancillary data: TIPC_RETDATA");
>>>>>                 } else if (anc->cmsg_type == TIPC_DESTNAME) {
>>>>>                     if (sz == 0) {
>>>>>                         m_MDS_LOG_DBG("MDTM: recd bytes=0 on
>>>>> received on sock, abnormal/unknown  condition. Ignoring");


------------------------------------------------------------------------------
_______________________________________________
Opensaf-devel mailing list
Opensaf-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/opensaf-devel

Reply via email to