[
https://issues.apache.org/jira/browse/PROTON-2785?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17869357#comment-17869357
]
ASF GitHub Bot commented on PROTON-2785:
----------------------------------------
PatrickTaibel opened a new pull request, #431:
URL: https://github.com/apache/qpid-proton/pull/431
We've also hit the Go library bug described in PROTON-2785.
It occurs when one large message is sent and another message of arbitrary
size is marshaled afterwards.
The cause is that encoding the large message calls `pn_data_encode` several
times with a buffer that is too small. Each failed attempt makes
`pn_encoder_encode` set the `PN_OVERFLOW` error on the data field, until a
large enough buffer is provided. Because the `pn_message_t` is reused, that
leftover error then hits the encoding of the second message.
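To make the failure mode concrete, here is a minimal pure-Go model of the grow-and-retry pattern. It is only a sketch: `fakeData`, `encode`, `encodeGrowing` and `errOverflow` are hypothetical stand-ins for `pn_data_t`, `pn_data_encode`, the Go codec loop and `PN_OVERFLOW`. It does not call the real Proton API.

```go
package main

import (
	"errors"
	"fmt"
)

// errOverflow stands in for PN_OVERFLOW in this model (hypothetical name).
var errOverflow = errors.New("overflow: not enough space to encode")

// fakeData is a hypothetical stand-in for pn_data_t: a payload plus a
// sticky error field that failed encode attempts write to.
type fakeData struct {
	payload []byte
	err     error
}

// encode copies the payload into buf. If buf is too small it records the
// overflow on the data object and also returns it to the caller.
func (d *fakeData) encode(buf []byte) (int, error) {
	if len(buf) < len(d.payload) {
		d.err = errOverflow
		return 0, errOverflow
	}
	return copy(buf, d.payload), nil
}

// encodeGrowing retries with a doubled buffer until the payload fits.
// It succeeds, but nothing resets d.err afterwards.
func encodeGrowing(d *fakeData, size int) []byte {
	if size < 1 {
		size = 1
	}
	for {
		buf := make([]byte, size)
		n, err := d.encode(buf)
		if err == nil {
			return buf[:n]
		}
		size *= 2
	}
}

func main() {
	d := &fakeData{payload: make([]byte, 300)}
	out := encodeGrowing(d, 64)
	// The encode succeeded, yet the overflow is still recorded on d.
	fmt.Println(len(out), d.err)
}
```

In this model the first message is encoded successfully, but the object still carries the overflow from the earlier undersized attempts, which is the state a reused object would hand to the next message.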
Relevant Go code:
https://github.com/apache/qpid-proton/blob/9fdc19c53ea92254d9f4d5f7ff1809ed6f953503/go/pkg/amqp/message.go#L378-L393
Relevant C code:
https://github.com/apache/qpid-proton/blob/9fdc19c53ea92254d9f4d5f7ff1809ed6f953503/c/src/core/encoder.c#L395-L409
This PR clears the error field when `pn_data_clear` is called.
According to the docs, "A cleared pn_data_t object is equivalent to a newly
constructed one.", so this fix matches the documented behavior.
Another option would be to not set those `PN_OVERFLOW` errors on `pn_data_t`
at all. I don't have enough of an overview of the internals to know whether
that makes sense, especially as a few other errors are also set on
`pn_data_t` and might cause similar issues.
> [Go] Message of certain size fail to be marshalled by amqp module
> -----------------------------------------------------------------
>
> Key: PROTON-2785
> URL: https://issues.apache.org/jira/browse/PROTON-2785
> Project: Qpid Proton
> Issue Type: Bug
> Reporter: Martin
> Priority: Major
> Attachments: qpid-reproducer.go
>
>
> We used the Go bindings of qpid-proton at a fairly old version (v0.33.0)
> in our project. After upgrading to v0.39.0, message transfer fails with a
> panic during message marshaling. Messages of a certain size and larger (218
> bytes in my lab environment) panic on the second send. The first one always
> passes without an issue:
> {code:java}
> [stack@tripleo-standalone sensubility]$ go run reproducer.go --address amqp://127.0.0.1:5666
> [0] Sending two messages of size 217 bytes.
> [1] Sending two messages of size 220 bytes.
> [0] Sent message ACKed.
> [0] Sent message ACKed.
> [1] Sent message ACKed.
> panic: cannot marshal string: overflow: not enough space to encode
>
> goroutine 35 [running]:
> github.com/apache/qpid-proton/go/pkg/amqp.marshal({0x53d080?, 0xc00019a030?},
> 0x7f5938001d20)
>
> /home/stack/go/pkg/mod/github.com/apache/[email protected]/go/pkg/amqp/marshal.go:295
> +0x9e5
> github.com/apache/qpid-proton/go/pkg/amqp.putData({0x53d080, 0xc00019a030},
> 0xc00012bd10?)
>
> /home/stack/go/pkg/mod/github.com/apache/[email protected]/go/pkg/amqp/message.go:508
> +0x48
> github.com/apache/qpid-proton/go/pkg/amqp.(*message).put(0xc00019c280,
> 0xc00012bd48?)
>
> /home/stack/go/pkg/mod/github.com/apache/[email protected]/go/pkg/amqp/message.go:560
> +0x2b8
> github.com/apache/qpid-proton/go/pkg/amqp.(*MessageCodec).Encode(0x7f5934000c90?,
> {0x5a47b0?, 0xc00019c280?}, {0x0, 0x0, 0x0})
>
> /home/stack/go/pkg/mod/github.com/apache/[email protected]/go/pkg/amqp/message.go:380
> +0x97
> github.com/apache/qpid-proton/go/pkg/electron.(*sender).send(0xc0000bc000,
> 0xc000194150)
>
> /home/stack/go/pkg/mod/github.com/apache/[email protected]/go/pkg/electron/sender.go:197
> +0x11d
> github.com/apache/qpid-proton/go/pkg/electron.(*sender).trySend(0xc0000bc000)
>
> /home/stack/go/pkg/mod/github.com/apache/[email protected]/go/pkg/electron/sender.go:187
> +0x25
> github.com/apache/qpid-proton/go/pkg/electron.(*sender).startSend(...)
>
> /home/stack/go/pkg/mod/github.com/apache/[email protected]/go/pkg/electron/sender.go:179
> github.com/apache/qpid-proton/go/pkg/electron.(*sender).SendAsyncTimeout.func1()
>
> /home/stack/go/pkg/mod/github.com/apache/[email protected]/go/pkg/electron/sender.go:230
> +0xbb
> github.com/apache/qpid-proton/go/pkg/proton.(*Engine).Run(0xc0001261b0)
>
> /home/stack/go/pkg/mod/github.com/apache/[email protected]/go/pkg/proton/engine.go:376
> +0x134
> github.com/apache/qpid-proton/go/pkg/electron.(*connection).run(0xc0001320f0)
>
> /home/stack/go/pkg/mod/github.com/apache/[email protected]/go/pkg/electron/connection.go:241
> +0x3f
> created by github.com/apache/qpid-proton/go/pkg/electron.NewConnection in
> goroutine 1
>
> /home/stack/go/pkg/mod/github.com/apache/[email protected]/go/pkg/electron/connection.go:224
> +0x545
> exit status 2
> {code}
>
> We used to transfer much larger messages, so this is quite problematic for
> us. The AMQP component we use for message transfer is a qdrouterd mesh, but
> the problem is reproducible on a single qdr too.
> {code:java}
> [root@tripleo-standalone ~]# podman exec -it qdr qdstat -v
> 2023-12-22 10:26:29.995103 UTC
> Standalone_n6PHE7MhBoAhzi8
>
> Router Statistics
> attr value
> =============================================================
> Version 1.17.1
> Mode standalone
> Router Id Standalone_n6PHE7MhBoAhzi8
> Worker Threads 4
> Uptime 002:21:05:22
> VmSize 332 MiB
> Area 0
> Link Routes 0
> Auto Links 0
> Links 2
> Nodes 0
> Addresses 4
> Connections 1
> Presettled Count 278
> Dropped Presettled Count 3
> Accepted Count 1894
> Rejected Count 0
> Released Count 0
> Modified Count 0
> Deliveries Delayed > 1sec 0
> Deliveries Delayed > 10sec 0
> Deliveries Stuck > 10sec 0
> Deliveries to Fallback 0
> Links Blocked 0
> Ingress Count 2173
> Egress Count 2172
> Transit Count 0
> Deliveries from Route Container 0
> Deliveries to Route Container 0
> [root@tripleo-standalone ~]#
> {code}
> Minimal reproducer is attached: [^qpid-reproducer.go]
>