Hi HansN,

Sorry for the delay.

I will test it and get back to you soon.

-AVM


On 8/31/2016 4:29 PM, Hans Nordebäck wrote:
> Hi Mahesh,
> Any updates on this?
>
> /Regards HansN
>
> -----Original Message-----
> From: Anders Widell
> Sent: den 25 augusti 2016 13:11
> To: A V Mahesh <mahesh.va...@oracle.com>; Hans Nordebäck 
> <hans.nordeb...@ericsson.com>; mathi.naic...@oracle.com
> Cc: opensaf-devel@lists.sourceforge.net
> Subject: Re: [PATCH 1 of 1] MDS: Log TIPC dropped messages [#1957]
>
> Hi!
>
> This is what the TIPC user documentation says about TIPC_DEST_DROPPABLE:
> "This option governs the handling of messages sent by the socket if the 
> message cannot be delivered to its destination, either because the receiver 
> is congested or because the specified receiver does not exist.
> If enabled, the message is discarded; otherwise the message is returned to 
> the sender."
>
> This is what the TIPC user documentation says about the return value from the 
> recvmsg() system call: "When used with a connectionless socket, a return 
> value of 0 indicates the arrival of a returned data message that was 
> originally sent by this socket."
>
> I think the documentation is pretty clear. If you set TIPC_DEST_DROPPABLE to 
> true, the receiver can discard messages e.g. when the receive buffer is full. 
> The sender will not be notified in this case. If TIPC_DEST_DROPPABLE is set 
> to false, the message will be returned to the sender in case of a full 
> receive buffer. The sender knows that it has received such a returned message 
> when the recvmsg() call returns zero.
>
> regards,
> Anders Widell
>
> On 08/25/2016 11:30 AM, A V Mahesh wrote:
>> Hi HansN,
>>
>>
>> On 8/23/2016 5:22 PM, Hans Nordebäck wrote:
>>
>>> Hi Mahesh,
>>>
>>> Yes, this is my understanding too, if TIPC_DROPPABLE = true tipc may
>>> drop messages silently,  at receive sock buffer full condition,  but
>>> do not return any ancillary message.
>>> If TIPC_DROPPABLE = false tipc may drop message but will send an
>>> ancillary message to inform about TIPC_ERR_OVERLOAD.
>> [AVM]
>>
>> My observation are understanding is different, based on TIPC code and
>> Linux TIPC 2.0 Programmer's Guide , that the TIPC_ERR_OVERLOAD error
>> returned when TIPC is unable to enqueue an incoming message on the
>> receiving socket's receive queue irrelevant of TIPC_DEST_DROPPABLE
>> enabled or disabled.
>>
>> The only difference between TIPC_DEST_DROPPABLE enabled or disabled is
>> , If  TIPC_DEST_DROPPABLE enabled, the message is discarded and
>> recvmsg() returned size is ZERO and application will get errors, if
>> TIPC_DEST_DROPPABLE disabled  the message is returned to the sender it
>> means the recvmsg() returned size is user send data size and
>> application will get errors .
>>
>> I did check the TIPC code and documentations  and I haven't  get any
>> evidences that  TIPC_ERR_OVERLOAD error code will be send only If
>> TIPC_DEST_DROPPABLE = false.
>>
>> Even while testing #1227
>> (https://sourceforge.net/p/opensaf/mailman/message/33207717/) my
>> observations and understanding was, an individual TIPC socket is only
>> allowed to queue up
>> OVERLOAD_LIMIT_BASE/2 messages of the lowest importance level before
>> it starts rejecting them.
>> Once a socket receiving queue length exceeds the maximum limit value,
>> the receiving socket will send out a reject message  with
>> TIPC_ERR_OVERLOAD error code with cmsg_type as
>> TIPC_ERRINFO/TIPC_RETDATA, and the tipc code and Linux TIPC 2.0
>> Programmer's Guide  confirmed the same .
>>
>> tipc/socket.c
>> =======================================================
>> /* Reject message if there isn't room to queue it */
>>
>> recv_q_len = (u32)atomic_read(&tipc_queue_size);
>> if (unlikely(recv_q_len >= OVERLOAD_LIMIT_BASE)) {
>>      if (rx_queue_full(msg, recv_q_len, OVERLOAD_LIMIT_BASE))
>>          return TIPC_ERR_OVERLOAD;
>> }
>> recv_q_len = skb_queue_len(&sk->sk_receive_queue);
>> if (unlikely(recv_q_len >= (OVERLOAD_LIMIT_BASE / 2))) {
>>      if (rx_queue_full(msg, recv_q_len, OVERLOAD_LIMIT_BASE / 2))
>>          return TIPC_ERR_OVERLOAD;
>> }
>> =======================================================
>>
>>
>> 2.1.17. setsockopt() of  TIPC 2.0 Programmer's Guide
>> =======================================================
>> TIPC_DEST_DROPPABLE
>> This option governs the handling of messages sent by the socket if the
>> message cannot be delivered to its destination, either because the
>> receiver is congested or because the specified receiver does not
>> exist. If enabled, the message is discarded; otherwise the message is
>> returned to the sender.
>>
>> By default, this option is disabled for SOCK_SEQPACKET and SOCK_STREAM
>> socket types, and enabled for SOCK_RDM and SOCK_DGRAM, This
>> arrangement ensures proper teardown of failed connections when
>> connection-oriented data transfer is used, without increasing the
>> complexity of connectionless data transfer.
>>
>> TIPC_SRC_DROPPABLE
>> This option governs the handling of messages sent by the socket if
>> link congestion occurs. If enabled, the message is discarded;
>> otherwise the system queues the message for later transmission.
>> By default, this option is disabled for SOCK_SEQPACKET, SOCK_STREAM,
>> and SOCK_RDM socket types (resulting in "reliable" data transfer), and
>> enabled for SOCK_DGRAM (resulting in "unreliable" data transfer).
>> =======================================================
>>
>> Now I will try to create OVERLOAD case and update you soon my latest
>> observations.
>>
>> -AVM
>>
>>> Correcting this and adding an abort is not backward compatible as
>>> some service already handle flow control in some way, only log when
>>> packages are dropped.
>>> Regarding ticket #1960 there are other solutions than introducing
>>> flow control in MDS, e.g. expose an option to the service to choose
>>> connection oriented or connection less.
>>> The problem with dropped messages seems in one case related to, (by
>>> MDS), intensive MDS logging.
>>>
>>> /Thanks HansN
>>> -----Original Message-----
>>> From: A V Mahesh [mailto:mahesh.va...@oracle.com]
>>> Sent: den 23 augusti 2016 11:27
>>> To: Hans Nordebäck <hans.nordeb...@ericsson.com>; Anders Widell
>>> <anders.wid...@ericsson.com>; mathi.naic...@oracle.com
>>> Cc: opensaf-devel@lists.sourceforge.net
>>> Subject: Re: [PATCH 1 of 1] MDS: Log TIPC dropped messages [#1957]
>>>
>>> Hi HansN,
>>>
>>> It seems I am missing some thing , please allow me to under stand
>>>
>>> If I currently understand you observation :
>>>
>>> With current Opensaf code ( this #1957 patch NOT applied ) , by
>>> default TIPC_DROPPABLE=true ,while running Opensaf with that binary
>>> when TIPC_ERR_OVERLOAD  occurring, TIPC is not  given errors
>>> TIPC_ERRINFO or  TIPC_RETDATA and following code is not being get hit
>>> of function recvfrom_connectionless(), is my understanding right ?
>>>
>>> =====================================================================
>>> ========================================
>>>
>>>
>>> *if (anc->cmsg_type == TIPC_ERRINFO) {*
>>>        /* TIPC_ERRINFO - TIPC error code associated with a returned
>>> data message or a connection termination message  so abort */
>>>        m_MDS_LOG_CRITICAL("MDTM: undelivered message condition
>>> ancillary
>>> data: TIPC_ERRINFO abort err :%s", strerror(errno) );
>>> *abort();*
>>> *} else if (anc->cmsg_type == TIPC_RETDATA) {*
>>>        /* If we set TIPC_DEST_DROPPABLE off messge (configure TIPC to
>>> return rejected messages to the sender )
>>>           we will hit this when we implement MDS retransmit lost
>>> messages abort can be replaced with flow control logic*/
>>>        for (i = anc->cmsg_len - sizeof(*anc); i > 0; i--) {
>>>            m_MDS_LOG_DBG("MDTM: returned byte 0x%02x\n", *cptr);
>>>            cptr++;
>>>        }
>>>        /* TIPC_RETDATA -The contents of a returned data message so
>>> abort */
>>>        m_MDS_LOG_CRITICAL("MDTM: undelivered message condition
>>> ancillary
>>> data: TIPC_RETDATA abort err :%s", strerror(errno) );
>>> *abort();*
>>> }
>>>
>>> =====================================================================
>>> ========================================
>>>
>>>
>>> -AVM
>>>
>>>
>>> On 8/23/2016 1:08 PM, Hans Nordebäck wrote:
>>>> Hi Mahesh,
>>>>
>>>> Please see response below with [HansN] /Thanks HansN
>>>>
>>>> -----Original Message-----
>>>> From: A V Mahesh [mailto:mahesh.va...@oracle.com]
>>>> Sent: den 23 augusti 2016 08:25
>>>> To: Hans Nordebäck <hans.nordeb...@ericsson.com>; Anders Widell
>>>> <anders.wid...@ericsson.com>; mathi.naic...@oracle.com
>>>> Cc: opensaf-devel@lists.sourceforge.net
>>>> Subject: Re: [PATCH 1 of 1] MDS: Log TIPC dropped messages [#1957]
>>>>
>>>> Hi HansN
>>>>
>>>> Please see response below with [AVM]
>>>>
>>>> -AVM
>>>>
>>>> On 8/23/2016 11:41 AM, Hans Nordebäck wrote:
>>>>> Hi Mahesh,
>>>>>
>>>>> please see comments below.
>>>>>
>>>>> /Thanks HansN
>>>>>
>>>>>
>>>>> On 08/23/2016 07:21 AM, A V Mahesh wrote:
>>>>>> Hi HansN,
>>>>>>
>>>>>> Let us fist discuss the error handling and abort, then we can come
>>>>>> back to interpretation of  TIPC currently  does permit  OR does
>>>>>> not permit an application to send a multicast message with the
>>>>>> "destination droppable" setting disabled.
>>>>>>
>>>>>> Let us disable TIPC_DEST_DROPPABLE, so that  TIPC will try to
>>>>>> return an undelivered multicast message to its sender and we can
>>>>>> determine issue is  because of TIPC_ERR_OVERLOAD, this helps in
>>>>>> debugging , so that application may increased SO_SNDBUF/SO_RCVBUF
>>>>>> to reduce the problem.
>>>>>>
>>>>>> But still we need to abort(), the reason for that is current MDS
>>>>>> implementations doesn't have flow control logic ( no retry because
>>>>>> of error ) , so Application like AMF can go wrong and cluster will
>>>>>> go into unstable/recoverble state.
>>>>>>
>>>>> [HansN] In the current implementation messages are dropped silently
>>>>> and no abort is done.
>>>> [AVM]  I can see  abort(); in current code , you mean abort(); is
>>>> not working and application(amf) is not existing ?
>>>> [HansN] In case of TIPC_DROPPABLE=true and messages are dropped,
>>>> (TIPC_ERR_OVERLOAD)  no abort is be performed, e.g amfd detects this
>>>> in the msg sanity chk and logs "invalid msg id ..."
>>>> ====================================================================
>>>> ==
>>>> ======
>>>> if (anc->cmsg_type == TIPC_ERRINFO) {
>>>>         /* TIPC_ERRINFO - TIPC error code associated with a returned
>>>> data message or a connection termination message  so abort */
>>>>         m_MDS_LOG_CRITICAL("MDTM: undelivered message condition
>>>> ancillary
>>>> data: TIPC_ERRINFO abort err :%s", strerror(errno) );
>>>> *abort();*
>>>> } else if (anc->cmsg_type == TIPC_RETDATA) {
>>>>         /* If we set TIPC_DEST_DROPPABLE off messge (configure TIPC
>>>> to return rejected messages to the sender )
>>>>            we will hit this when we implement MDS retransmit lost
>>>> messages abort can be replaced with flow control logic*/
>>>>         for (i = anc->cmsg_len - sizeof(*anc); i > 0; i--) {
>>>>             m_MDS_LOG_DBG("MDTM: returned byte 0x%02x\n", *cptr);
>>>>             cptr++;
>>>>         }
>>>>         /* TIPC_RETDATA -The contents of a returned data message  so
>>>> abort */
>>>>         m_MDS_LOG_CRITICAL("MDTM: undelivered message condition
>>>> ancillary
>>>> data: TIPC_RETDATA abort err :%s", strerror(errno) );
>>>> *abort();*
>>>> }
>>>> ====================================================================
>>>> ==
>>>> ======
>>>>> This patch enables logging
>>>>> when packages are dropped to help in debugging. I don't agree that
>>>>> we should also introduce abort, but instead:
>>>>> 1) Implement a solution to handle dropped packages, ticket #1960
>>>> [AVM]  This is nothing but flow control implementation in MDS, this
>>>> is future enhancement
>>>>
>>>>> 2) Investigate why packages may be dropped, the receiving MDS
>>>>> thread is a real time thread and should be able to consume a large
>>>>> amount of incoming messages.
>>>>> E.g. is the receiving MDS thread "live hanging" due to locks, file
>>>>> I/O etc?
>>>>>> This was the reason we haven't gone for it while addressing Ticket
>>>>>> #1227
>>>>>> (https://sourceforge.net/p/opensaf/mailman/message/33207717/)
>>>>>> So currently we don't have any advantage of disabling
>>>>>> TIPC_DEST_DROPPABLE and not allowing multicast  messages.
>>>>>>
>>>>>> -AVM
>>>>>>
>>>>>>
>>>>>> On 8/18/2016 2:43 PM, Hans Nordeback wrote:
>>>>>>> osaf/libs/core/mds/mds_dt_tipc.c |  32
>>>>>>> +++++++++++++++++++++++++-------
>>>>>>>      1 files changed, 25 insertions(+), 7 deletions(-)
>>>>>>>
>>>>>>>
>>>>>>> diff --git a/osaf/libs/core/mds/mds_dt_tipc.c
>>>>>>> b/osaf/libs/core/mds/mds_dt_tipc.c
>>>>>>> --- a/osaf/libs/core/mds/mds_dt_tipc.c
>>>>>>> +++ b/osaf/libs/core/mds/mds_dt_tipc.c
>>>>>>> @@ -320,6 +320,15 @@ uint32_t mdtm_tipc_init(NODE_ID nodeid,
>>>>>>>                      m_MDS_LOG_INFO("MDTM: Successfully set
>>>>>>> default socket option TIPC_IMP = %d", TIPCIMPORTANCE);
>>>>>>>              }
>>>>>>>      +        int droppable = 0;
>>>>>>> +        if (setsockopt(tipc_cb.BSRsock, SOL_TIPC,
>>>>>>> TIPC_DEST_DROPPABLE, &droppable, sizeof(droppable)) != 0) {
>>>>>>> +                LOG_ER("MDTM: Can't set TIPC_DEST_DROPPABLE to
>>>>>>> + zero
>>>>>>> err :%s\n", strerror(errno));
>>>>>>> +                m_MDS_LOG_ERR("MDTM: Can't set
>>>>>>> + TIPC_DEST_DROPPABLE
>>>>>>> to zero err :%s\n", strerror(errno));
>>>>>>> +                osafassert(0);
>>>>>>> +        } else {
>>>>>>> +                m_MDS_LOG_NOTIFY("MDTM: Successfully set
>>>>>>> TIPC_DEST_DROPPABLE to zero");
>>>>>>> +        }
>>>>>>> +
>>>>>>>          return NCSCC_RC_SUCCESS;
>>>>>>>      }
>>>>>>>      @@ -563,6 +572,8 @@ ssize_t recvfrom_connectionless (int sd,
>>>>>>>          unsigned char *cptr;
>>>>>>>          int i;
>>>>>>>          int has_addr;
>>>>>>> +    int anc_data[2];
>>>>>>> +
>>>>>>>          ssize_t sz;
>>>>>>>            has_addr = (from != NULL) && (addrlen != NULL); @@
>>>>>>> -591,19
>>>>>>> +602,26 @@ ssize_t recvfrom_connectionless (int sd,
>>>>>>>                     if the message was sent using a TIPC name or
>>>>>>> name sequence as the
>>>>>>>                     destination rather than a TIPC port ID So
>>>>>>> abort for TIPC_ERRINFO and TIPC_RETDATA*/
>>>>>>>                  if (anc->cmsg_type == TIPC_ERRINFO) {
>>>>>>> -                /* TIPC_ERRINFO - TIPC error code associated with a
>>>>>>> returned data message or a connection termination message  so
>>>>>>> abort */
>>>>>>> -                m_MDS_LOG_CRITICAL("MDTM: undelivered message
>>>>>>> condition ancillary data: TIPC_ERRINFO abort err :%s",
>>>>>>> strerror(errno) );
>>>>>>> -                abort();
>>>>>>> +                anc_data[0] = *((unsigned int*)(CMSG_DATA(anc) +
>>>>>>> 0));
>>>>>>> +                if (anc_data[0] == TIPC_ERR_OVERLOAD) {
>>>>>>> +                    LOG_CR("MDTM: undelivered message condition
>>>>>>> ancillary data: TIPC_ERR_OVERLOAD");
>>>>>>> +                    m_MDS_LOG_CRITICAL("MDTM: undelivered
>>>>>>> + message
>>>>>>> condition ancillary data: TIPC_ERR_OVERLOAD");
>>>>>>> +                } else {
>>>>>>> +                    /* TIPC_ERRINFO - TIPC error code associated
>>>>>>> with a returned data message or a connection termination message
>>>>>>> so abort */
>>>>>>> +                    LOG_CR("MDTM: undelivered message condition
>>>>>>> ancillary data: TIPC_ERRINFO abort err : %d", anc_data[0]);
>>>>>>> +                    m_MDS_LOG_CRITICAL("MDTM: undelivered
>>>>>>> + message
>>>>>>> condition ancillary data: TIPC_ERRINFO abort err : %d",
>>>>>>> anc_data[0]);
>>>>>>> +                }
>>>>>>>                  } else if (anc->cmsg_type == TIPC_RETDATA) {
>>>>>>> -                /* If we set TIPC_DEST_DROPPABLE off messge
>>>>>>> (configure TIPC to return rejected messages to the sender )
>>>>>>> +                /* If we set TIPC_DEST_DROPPABLE off message
>>>>>>> (configure TIPC to return rejected messages to the sender )
>>>>>>>                         we will hit this when we implement MDS
>>>>>>> retransmit lost messages  abort can be replaced with flow control
>>>>>>> logic*/
>>>>>>>                      for (i = anc->cmsg_len - sizeof(*anc); i > 0;
>>>>>>> i--) {
>>>>>>> -                    m_MDS_LOG_DBG("MDTM: returned byte 0x%02x\n",
>>>>>>> *cptr);
>>>>>>> +                    LOG_CR("MDTM: returned byte 0x%02x\n", *cptr);
>>>>>>> +                    m_MDS_LOG_CRITICAL("MDTM: returned byte
>>>>>>> 0x%02x\n", *cptr);
>>>>>>>                          cptr++;
>>>>>>>                      }
>>>>>>>                      /* TIPC_RETDATA -The contents of a returned
>>>>>>> data message  so abort */
>>>>>>> -                m_MDS_LOG_CRITICAL("MDTM: undelivered message
>>>>>>> condition ancillary data: TIPC_RETDATA abort err :%s",
>>>>>>> strerror(errno) );
>>>>>>> -                abort();
>>>>>>> +                LOG_CR("MDTM: undelivered message condition
>>>>>>> ancillary data: TIPC_RETDATA");
>>>>>>> +                m_MDS_LOG_CRITICAL("MDTM: undelivered message
>>>>>>> condition ancillary data: TIPC_RETDATA");
>>>>>>>                  } else if (anc->cmsg_type == TIPC_DESTNAME) {
>>>>>>>                      if (sz == 0) {
>>>>>>>                          m_MDS_LOG_DBG("MDTM: recd bytes=0 on
>>>>>>> received on sock, abnormal/unknown  condition. Ignoring");
>


------------------------------------------------------------------------------
_______________________________________________
Opensaf-devel mailing list
Opensaf-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/opensaf-devel

Reply via email to