[ 
https://issues.apache.org/jira/browse/PROTON-2162?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16995569#comment-16995569
 ] 

Robbie Gemmell commented on PROTON-2162:
----------------------------------------

I noticed after typing this that you've actually marked this as 'affects 
0.29.0', while I had read the description as saying the behaviour regressed in 
0.30.0. So, my entire text might be invalid. Which version are you saying 
the regression is in? Original text below.


I don't think this behaves quite as it is described.

I see different behaviour on different runs, seemingly down to the race over 
which messages etc. have been sent before the receiver errors out and the 
connection is unceremoniously chopped. I tried several runs with equal versions 
for the peers and with mismatched versions. I mostly see 0.30.0-rc1 generating 
(not receiving, as the description suggests) an outgoing close with a 
'connection reset by peer' condition in much the same manner as 0.29.0 (bar the 
logging differences). I also observed it generating a close without any error 
when it had got all the messages sent before the connection barfed out.

 0.30.0-rc1 still sending messages when connection goes:
{noformat}
[0x20ea150]: AMQP:FRAME:0 -> @close(24) [error=@error(29) 
[condition=:"proton:io", description="Connection reset by peer - on write to 
:5672 (connection aborted)"]]
[0x20ea150]: IO:FRAME: <- EOS
PN_TRANSPORT_CLOSED: proton:io: Connection reset by peer - on write to :5672 
(connection aborted)
{noformat}
 
 0.30.0-rc1 completed sending before connection goes:
{noformat}
10 messages started and aborted
[0x1f86150]: AMQP:FRAME:0 -> @transfer(20) [handle=0, delivery-id=9, 
delivery-tag=b"\x09\x00\x00\x00", message-format=0, settled=true, aborted=true]
[0x1f86150]: AMQP:FRAME:0 -> @close(24) []
[0x1f86150]:   IO:FRAME:  -> EOS
[0x1f86150]:   IO:FRAME:  <- EOS
PN_TRANSPORT_CLOSED: proton:io: Connection reset by peer - on read from :5672 
(connection aborted)
{noformat}

I did not actually observe it doing what you saw, however. Do you have more of 
the surrounding trace, better showing what the overall state was when that 
error was being generated (but perhaps not actually getting anywhere)?

If proton-c behaves like proton-j (which often copied proton-c originally) in 
this regard, the 'framing-error, connection aborted' is likely being used 
locally as a form of catch-all 'the transport was closed early' condition. It 
can then sometimes be used when generating an outgoing close (which might not 
actually be sent anywhere, and isn't received in this case) in certain 
exceptional conditions, depending on whether a connection error was set too. It 
seems likely there is just a particular timing in the 'send vs. connection 
barfs out' race which might get it there instead of the other path, where it 
seemingly errors during write.
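
To make that guess concrete, here is a minimal sketch of the kind of selection 
logic I mean. It is purely illustrative: condition_t and close_condition() are 
hypothetical names, not proton-c's (or proton-j's) real types or functions, and 
the ordering of the checks is my assumption rather than something taken from 
the source.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for an error condition; NOT a real proton-c type. */
typedef struct {
    const char *name;        /* e.g. "proton:io"; NULL means no error */
    const char *description;
} condition_t;

/*
 * Assumed logic: choose the condition for a locally generated close frame.
 *  - an error already set (e.g. by a failed write) is used as-is;
 *  - otherwise, if the transport was closed early, fall back to the
 *    catch-all 'framing-error, connection aborted';
 *  - otherwise the close carries no error at all.
 */
static condition_t close_condition(const condition_t *conn_error,
                                   bool closed_early)
{
    if (conn_error != NULL && conn_error->name != NULL) {
        return *conn_error;
    }
    if (closed_early) {
        return (condition_t){ "amqp:connection:framing-error",
                              "connection aborted" };
    }
    return (condition_t){ NULL, NULL };
}
```

Under those assumptions, the first trace above corresponds to the first branch 
(an error was set during write), the second trace to the last branch (close 
generated with no error), and the close you reported would correspond to the 
middle branch.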

It wouldn't surprise me if the particular behaviour / timing of this example 
scenario differs between 0.29.0 and 0.30.0 mainly due to some of the work done 
around memory usage, such as reducing the transport buffer sizes, e.g. there is 
more output waiting to be sent, and the sending of that output (before the 
generated close) varies the observed behaviour slightly.

The receiver seems to behave the same overall (barf on first transfer, exit and 
chop connection unceremoniously), and the sender seems like it perhaps varies 
slightly based on timing, which it likely always could have, in which case I'm 
not sure I really see this as a regression. It would be nice to better 
understand when each condition occurs though.

 

> [c] Regression with aborted transfers: connection closes with framing error
> ---------------------------------------------------------------------------
>
>                 Key: PROTON-2162
>                 URL: https://issues.apache.org/jira/browse/PROTON-2162
>             Project: Qpid Proton
>          Issue Type: Bug
>          Components: proton-c
>    Affects Versions: proton-c-0.29.0
>         Environment: Fedora 29, debug build
>            Reporter: Charles E. Rolke
>            Priority: Major
>
> Using normal example code:
>  * run build/cpp/examples/direct_recv
>  * run build/c/examples/send-abort
> Tested against 0.27.1, 0.28.0, 0.29.0, and 0.30.1-rc1
> In all cases direct_recv exits with a 'receiver read failure' upon receiving 
> the first aborted transfer.
> In older versions (before 0.30.x) send_abort receives a close with error:
> {noformat}
>  0 -> @close(24) [error=@error(29) [condition=:"proton:io", 
> description="Connection reset by peer - on write to :5672 (connection 
> aborted)"]]
> {noformat}
> In the 0.30.1-rc1 version the send_abort receives a close with error:
> {noformat}
>  AMQP:FRAME:0 -> @close(24) [error=@error(29) 
> [condition=:"amqp:connection:framing-error", description="connection 
> aborted"]]
> {noformat}



