[ https://issues.apache.org/jira/browse/PROTON-157?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13508786#comment-13508786 ]
Gordon Sim commented on PROTON-157: ----------------------------------- I believe I now understand how this comes about. The delivery buffer is filled up with outgoing transfers. However, as credit is added by the peer, more deliveries can be queued up in the tpwork queue. Then a disposition comes in for a portion of the previously sent messages. This results in the broker settling those transfers which causes those deliveries to be added back onto the tpwork queue, behind the deliveries yet to be sent. Another delivery may now be added on to the tp work queue. If the pwork queue is then processed, it goes through the first batch of unset deliveries, but there is still no space in the delivery bufer, so these are left as is. It then processes the batch of settled messages, cleans up the delivery buffer and removes them from tpwork. Now finally it gets to the next unsent delivery, added after these. Since the buffer has now been cleaned, this gets assigned the next delivery id. However there are transfers ahead of it in the queue that have *not* yet had a delivery state assigned. The next time tp work is processed, these will get delivery state assigned with the delivery id out of sync. A suggested fix (that works for my case and passes tests) is to detect when we have removed deliveries from tp work, and start processing from the head. This may not be optimal in all cases of course, but it is a simple change: {noformat} Index: proton-c/src/engine/engine.c =================================================================== --- proton-c/src/engine/engine.c (revision 1416552) +++ proton-c/src/engine/engine.c (working copy) @@ -2219,7 +2219,11 @@ pn_clear_tpwork(delivery); } - delivery = delivery->tpwork_next; + if (delivery->tpwork) { + delivery = delivery->tpwork_next; + } else { + delivery = conn->tpwork_head; + } } } {noformat} > invalid delivery-id sent(?) > --------------------------- > > Key: PROTON-157 > URL: https://issues.apache.org/jira/browse/PROTON-157 > Project: Qpid Proton > Issue Type: Bug > Components: proton-c > Affects Versions: 0.2 > Reporter: Gordon Sim > > E.g. the following trace is from the client side (the peer was qpidd) and > appears to show a delivery-id being skipped: > [0xb2a800:1] <- TRANSFER @20 [1, 1520, b"\xf0\x05\x00\x00\x00\x00\x00\x00", > 0, false, false] (177) > "\x00Sp\xc0\x08\x05BP\x00@@R\x01\x00Ss\xd0\x00\x00\x00\x11\x00\x00\x00\x0d@@@@@@@@@@@@@\x00St\xd1\x00\x00\x00\x1a\x00\x00\x00\x04\xa1\x02snp\x00\x00\x16\xbd\xa1\x02ts\x80\x12\xc8\xfb\xe4\xdb\xd1\xc5\xdf\x00Su\xa0dXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX" > [0xb2a800:1] <- TRANSFER @20 [1, 1521, b"\xf1\x05\x00\x00\x00\x00\x00\x00", > 0, false, false] (177) > "\x00Sp\xc0\x08\x05BP\x00@@R\x01\x00Ss\xd0\x00\x00\x00\x11\x00\x00\x00\x0d@@@@@@@@@@@@@\x00St\xd1\x00\x00\x00\x1a\x00\x00\x00\x04\xa1\x02snp\x00\x00\x16\xbe\xa1\x02ts\x80\x12\xc8\xfb\xe4\xdb\xd1\xe8c\x00Su\xa0dXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX" > [0xb2a800:1] <- TRANSFER @20 [1, 1522, b"\xf2\x05\x00\x00\x00\x00\x00\x00", > 0, false, false] (177) > "\x00Sp\xc0\x08\x05BP\x00@@R\x01\x00Ss\xd0\x00\x00\x00\x11\x00\x00\x00\x0d@@@@@@@@@@@@@\x00St\xd1\x00\x00\x00\x1a\x00\x00\x00\x04\xa1\x02snp\x00\x00\x16\xbf\xa1\x02ts\x80\x12\xc8\xfb\xe4\xdb\xd2\x8b\x95\x00Su\xa0dXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX" > [0xb2a800:1] <- TRANSFER @20 [1, 1523, b"\xf3\x05\x00\x00\x00\x00\x00\x00", > 0, false, false] (177) > "\x00Sp\xc0\x08\x05BP\x00@@R\x01\x00Ss\xd0\x00\x00\x00\x11\x00\x00\x00\x0d@@@@@@@@@@@@@\x00St\xd1\x00\x00\x00\x1a\x00\x00\x00\x04\xa1\x02snp\x00\x00\x16\xc0\xa1\x02ts\x80\x12\xc8\xfb\xe4\xdb\xd2\xa8\xa5\x00Su\xa0dXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX" > [0xb2a800:1] <- TRANSFER @20 [1, 1525, b"\xf4\x05\x00\x00\x00\x00\x00\x00", > 0, false, false] (177) > "\x00Sp\xc0\x08\x05BP\x00@@R\x01\x00Ss\xd0\x00\x00\x00\x11\x00\x00\x00\x0d@@@@@@@@@@@@@\x00St\xd1\x00\x00\x00\x1a\x00\x00\x00\x04\xa1\x02snp\x00\x00\x16\xc1\xa1\x02ts\x80\x12\xc8\xfb\xe4\xdb\xd2\xc01\x00Su\xa0dXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX" > [0xb2a800:0] -> CLOSE @24 [@29 [:"amqp:session:invalid-field"]] > ERROR amqp:session:invalid-field sequencing error, expected delivery-id 1524, > got 1525 > It is of course possible that I am doing something wrong in the broker code, > but I can't think what would cause something like this (since I can't > influence the delivery ids directly). It only seems to happen when there is a > prefetch greater than 900. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira