[jira] [Created] (PROTON-1936) Support cross compiling to Windows from Linux

2018-09-17 Thread Marcel Meulemans (JIRA)
Marcel Meulemans created PROTON-1936:


 Summary: Support cross compiling to Windows from Linux
 Key: PROTON-1936
 URL: https://issues.apache.org/jira/browse/PROTON-1936
 Project: Qpid Proton
  Issue Type: Improvement
  Components: proton-c
Affects Versions: proton-c-0.25.0
Reporter: Marcel Meulemans


I am cross compiling proton for Windows via docker (multiarch/crossbuild) and 
running into a few minor issues that keep it from working out of the box 
(mainly include file casing). A pull request with more details will follow; it 
would be nice if this made it upstream.
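
Include casing matters here because Linux file systems are case-sensitive, while cross toolchains generally ship their Windows headers with lowercase file names. A minimal sketch of a check for such includes follows; the function name and the sample input are hypothetical and not taken from the pull request:

```python
import re

# Matches lines such as `#include <Windows.h>` and captures the header name.
INCLUDE_RE = re.compile(r'^\s*#\s*include\s*<([^>]+)>', re.MULTILINE)


def find_mixed_case_includes(source):
    """Return angle-bracket includes whose names contain uppercase letters."""
    return [name for name in INCLUDE_RE.findall(source) if name != name.lower()]


# Hypothetical sample input, not taken from the proton sources.
sample = '#include <Windows.h>\n#include <winsock2.h>\n#include <Ws2tcpip.h>\n'
print(find_mixed_case_includes(sample))
```

Running something like this over the C sources would list the includes that may fail to resolve on a case-sensitive host.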



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@qpid.apache.org
For additional commands, e-mail: dev-h...@qpid.apache.org



[jira] [Created] (PROTON-1892) Deliveries on different links use the same delivery-id

2018-07-10 Thread Marcel Meulemans (JIRA)
Marcel Meulemans created PROTON-1892:


 Summary: Deliveries on different links use the same delivery-id
 Key: PROTON-1892
 URL: https://issues.apache.org/jira/browse/PROTON-1892
 Project: Qpid Proton
  Issue Type: Bug
  Components: proton-j
Affects Versions: proton-j-0.27.1
Reporter: Marcel Meulemans
 Attachments: proton-j-delivery-id-fix.patch, proton-trace.log

Given a session with two outgoing links, two deliveries on separate links can 
end up sharing the same delivery-id. This happens when a multi-frame transfer 
is being sent on link A and a new (single-frame) transfer is sent (multiplexed) 
on link B before the delivery on link A completes. It occurs because the 
increment of the delivery-id counter (maintained per session) is delayed until 
the entire (multi-frame) delivery is complete 
([here|https://github.com/apache/qpid-proton-j/blob/e5a7dcade2996b2b68967949ddf1377f954bf579/proton-j/src/main/java/org/apache/qpid/proton/engine/impl/TransportImpl.java#L619]),
 allowing the second delivery to get the same delivery-id when calling 
getOutgoingDeliveryId 
[here|https://github.com/apache/qpid-proton-j/blob/e5a7dcade2996b2b68967949ddf1377f954bf579/proton-j/src/main/java/org/apache/qpid/proton/engine/impl/TransportImpl.java#L559].
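
The interleaving described above can be modelled with a toy session-wide counter. This is only a sketch: the class and method names are hypothetical, it is not the proton-j code, and incrementing on the first frame is just one possible fix, not necessarily what the attached patch does.

```python
class Session:
    """Toy model of a session-wide outgoing delivery-id counter."""

    def __init__(self, increment_on_first_frame):
        self.next_delivery_id = 0
        self.increment_on_first_frame = increment_on_first_frame
        self.in_progress = {}  # link name -> delivery-id of an unfinished delivery

    def send_frame(self, link, more):
        """Send one transfer frame on `link`; `more` marks a non-final frame."""
        if link not in self.in_progress:
            # First frame of a new delivery: take the current counter value.
            self.in_progress[link] = self.next_delivery_id
            if self.increment_on_first_frame:
                self.next_delivery_id += 1
        delivery_id = self.in_progress[link]
        if not more:
            # Final frame: the delivery is complete.
            del self.in_progress[link]
            if not self.increment_on_first_frame:
                self.next_delivery_id += 1
        return delivery_id


def interleave(session):
    a_first = session.send_frame("A", more=True)   # multi-frame delivery starts on A
    b_only = session.send_frame("B", more=False)   # single-frame delivery on B
    a_last = session.send_frame("A", more=False)   # delivery on A completes
    return a_first, b_only, a_last


print(interleave(Session(increment_on_first_frame=False)))  # delayed increment
print(interleave(Session(increment_on_first_frame=True)))   # increment up front
```

With the delayed increment, the delivery on link B receives the same id as the unfinished delivery on link A; incrementing when the first frame is sent keeps the ids distinct.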

My 100% reproduction scenario is as follows:
 * Run artemis (2.6.2 which uses proton-j 0.27.1) with an AMQP connector
 * Send a large message (10MB) to queue A
 * Send a couple of small messages to queue B
 * Connect a proton-c based client with a small maxFrameSize (8K) and limited 
credit to artemis and simultaneously subscribe to both queues (I think a flow 
frame triggers artemis to initiate a transfer, hence the limited credit).

With proton-c trace logging enabled you will get something like this:

[^proton-trace.log]

The attached patch fixes the issue.

[^proton-j-delivery-id-fix.patch]



[jira] [Commented] (PROTON-1846) [proton-c] Message decode fails with PN_OUT_OF_MEMORY if there are large lists in the message

2018-06-07 Thread Marcel Meulemans (JIRA)


[ 
https://issues.apache.org/jira/browse/PROTON-1846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16504462#comment-16504462
 ] 

Marcel Meulemans commented on PROTON-1846:
--

I tried the diff above and the {{uint32_t}} seems to introduce some unwanted 
side effects due to the signed/unsigned "magic" in 
{{pn_data_point/pn_data_restore}} (I did not look into the details, only 
noticed that this branch, 
[https://github.com/apache/qpid-proton/blob/master/c/src/core/codec.c#L1177] 
isn't hit when it should be). Using {{typedef int32_t pni_nid_t;}} and fixing 
PNI_NID_MAX accordingly did work for me.
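
Why PNI_NID_MAX has to change along with the typedef can be seen by emulating the {{((pni_nid_t)-1)}} cast for each candidate type. This is a small illustration using Python's ctypes, not proton code; the helper name is hypothetical, and the value actually chosen for PNI_NID_MAX in the signed case is not stated in this comment.

```python
import ctypes


def nid_max(c_type):
    """Emulate the C expression ((pni_nid_t)-1) for the given integer type."""
    return c_type(-1).value


print(nid_max(ctypes.c_uint16))  # the original typedef
print(nid_max(ctypes.c_uint32))  # the typedef from the quoted diff
print(nid_max(ctypes.c_int32))   # the signed typedef that worked here
```

For the unsigned types the cast yields the largest representable value, but for {{int32_t}} it yields -1, so the macro can no longer serve as an upper bound without being redefined.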

> [proton-c]  Message decode fails with PN_OUT_OF_MEMORY if there are large 
> lists in the message
> --
>
> Key: PROTON-1846
> URL: https://issues.apache.org/jira/browse/PROTON-1846
> Project: Qpid Proton
>  Issue Type: Bug
>  Components: proton-c
>Affects Versions: proton-c-0.22.0
>Reporter: Ganesh Murthy
>Priority: Major
> Attachments: send_large_structured_body.js
>
>
> Steps to reproduce -
>  
>  # Start the Qpid Dispatch router
>  # Run the following script that creates a bunch of addresses
>  # for i in `seq 1 6546`; do echo 
> "\{\"prefix\":\"address-$i\",\"distribution\":\"balanced\"}" | qdmanage 
> CREATE --type=org.apache.qpid.dispatch.router.config.address --name 
> address-$i --stdin; done
>  # now run qdmanage QUERY --type=address
>  # You will receive a Data error (-10)
> The following diff seems to fix the issue
> diff --git a/c/src/core/data.h b/c/src/core/data.h
> index 94dc7d67..f4320e2a 100644
> --- a/c/src/core/data.h
> +++ b/c/src/core/data.h
> @@ -27,7 +27,7 @@
>  #include "decoder.h"
>  #include "encoder.h"
>  
> -typedef uint16_t pni_nid_t;
> +typedef uint32_t pni_nid_t;
>  #define PNI_NID_MAX ((pni_nid_t)-1)






[jira] [Updated] (DISPATCH-1019) Messaging instability in networks with many clients / addresses.

2018-05-29 Thread Marcel Meulemans (JIRA)


 [ 
https://issues.apache.org/jira/browse/DISPATCH-1019?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marcel Meulemans updated DISPATCH-1019:
---
Description: 
Now that DISPATCH-966 has been fixed, I am still experiencing problems in a 
network with many clients / addresses. I am running a three-node fully 
connected mesh of dispatch routers with 1 attached clients, each with two 
addresses, messaging at around 100 msg/sec.

In the logs I am seeing the following errors:

{{2018-05-29 14:31:05.145732 + ERROR (error) Invalid message: Insufficient 
Data to Determine Tag}}
 {{2018-05-29 14:31:05.145748 + ERROR (error) Invalid message: Can't 
convert message field body}}

Which, in turn, lead to python errors like:

{{2018-05-29 14:31:05.145971 + ROUTER_MA (trace) RCVD: MAU(id=None pv=1 
area=0 mobile_seq=0)}}
{{2018-05-29 14:31:05.146130 + ROUTER (error) Exception in control message 
processing}}
{{ Traceback (most recent call last):}}
{{ File "/usr/lib/python2.7/qpid_dispatch_internal/router/engine.py", line 157, 
in handleControlMessage}}
{{ self.mobile_address_engine.handle_mau(msg, now)}}
{{ File "/usr/lib/python2.7/qpid_dispatch_internal/router/mobile.py", line 97, 
in handle_mau}}
{{ node = self.node_tracker.router_node(msg.id)}}
{{ File "/usr/lib/python2.7/qpid_dispatch_internal/router/node.py", line 363, 
in router_node}}
{{ return self.nodes[node_id]}}
{{ KeyError: None}}
{{2018-05-29 14:31:05.146175 + ROUTER (error) Control message error: 
opcode=MAU body=None}}

I have tracked down the cause of the "Insufficient Data to Determine Tag" 
message to the following: during the call to {{qd_parse}} of the {{MAU}} 
message, the {{qd_iterator_t}} reaches the end of the buffer list before it 
should. Specifically, the call to {{qd_iterator_advance}} here 
[https://github.com/apache/qpid-dispatch/blob/master/src/parse.c#L151] "fails" 
to move forward a certain number of bytes (e.g. 31) even though the 
{{iterator->view_pointer->remaining}} value has plenty of bytes left (e.g. 
84802). The "fail" happens because we reach the end of the buffer list before 
we should (here 
[https://github.com/apache/qpid-dispatch/blob/master/src/iterator.c#L323]).

What I have not been able to figure out yet is why this happens, because it is 
not consistent. Many large MAU messages are parsed correctly; only sometimes 
one is not. I am able to reproduce these errors every time I run my tests. 
There may be a timing component involved, because the more logging I add to the 
router code, the less often the errors seem to occur.

  was:
Now that DISPATCH-966 has been fixed, I am still experiencing problems in a 
network with many clients / addresses. I am running a three-node fully 
connected mesh of dispatch routers with 1 attached clients, each with two 
addresses, messaging at around 100 msg/sec.

In the logs I am seeing the following errors:

{{2018-05-29 14:31:05.145732 + ERROR (error) Invalid message: Insufficient 
Data to Determine Tag}}
 {{2018-05-29 14:31:05.145748 + ERROR (error) Invalid message: Can't 
convert message field body}}

Which, in turn, lead to python errors like:

{{2018-05-29 14:31:05.145971 + ROUTER_MA (trace) RCVD: MAU(id=None pv=1 
area=0 mobile_seq=0)}}
{{ 2018-05-29 14:31:05.146130 + ROUTER (error) Exception in control message 
processing}}
{{ Traceback (most recent call last):}}
{{File "/usr/lib/python2.7/qpid_dispatch_internal/router/engine.py", line 157, 
in handleControlMessage}}
{{self.mobile_address_engine.handle_mau(msg, now)}}
{{File "/usr/lib/python2.7/qpid_dispatch_internal/router/mobile.py", line 97, 
in handle_mau}}
{{node = self.node_tracker.router_node(msg.id)}}
{{File "/usr/lib/python2.7/qpid_dispatch_internal/router/node.py", line 363, in 
router_node}}
{{return self.nodes[node_id]}}
{{ KeyError: None}}
{{ 2018-05-29 14:31:05.146175 + ROUTER (error) Control message error: 
opcode=MAU body=None}}

I have tracked down the cause of the "Insufficient Data to Determine Tag" 
message to the following: during the call to {{qd_parse}} of the {{MAU}} 
message, the {{qd_iterator_t}} reaches the end of the buffer list before it 
should. Specifically, the call to {{qd_iterator_advance}} here 
[https://github.com/apache/qpid-dispatch/blob/master/src/parse.c#L151] "fails" 
to move forward a certain number of bytes (e.g. 31) even though the 
{{iterator->view_pointer->remaining}} value has plenty of bytes left (e.g. 
84802). The "fail" happens because we reach the end of the buffer list before 
we should (here 
[https://github.com/apache/qpid-dispatch/blob/master/src/iterator.c#L323]).

What I have not been able to figure out yet is why this happens, because it is 
not consistent. Many large MAU messages are parsed correctly; only sometimes 
one is not. 
I am able to reproduce these errors every 

[jira] [Updated] (DISPATCH-1019) Messaging instability in networks with many clients / addresses.

2018-05-29 Thread Marcel Meulemans (JIRA)


 [ 
https://issues.apache.org/jira/browse/DISPATCH-1019?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marcel Meulemans updated DISPATCH-1019:
---
Description: 
Now that DISPATCH-966 has been fixed, I am still experiencing problems in a 
network with many clients / addresses. I am running a three-node fully 
connected mesh of dispatch routers with 1 attached clients, each with two 
addresses, messaging at around 100 msg/sec.

In the logs I am seeing the following errors:

{{2018-05-29 14:31:05.145732 + ERROR (error) Invalid message: Insufficient 
Data to Determine Tag}}
 {{2018-05-29 14:31:05.145748 + ERROR (error) Invalid message: Can't 
convert message field body}}

Which, in turn, lead to python errors like:

{{2018-05-29 14:31:05.145971 + ROUTER_MA (trace) RCVD: MAU(id=None pv=1 
area=0 mobile_seq=0)}}
{{ 2018-05-29 14:31:05.146130 + ROUTER (error) Exception in control message 
processing}}
{{ Traceback (most recent call last):}}
{{File "/usr/lib/python2.7/qpid_dispatch_internal/router/engine.py", line 157, 
in handleControlMessage}}
{{self.mobile_address_engine.handle_mau(msg, now)}}
{{File "/usr/lib/python2.7/qpid_dispatch_internal/router/mobile.py", line 97, 
in handle_mau}}
{{node = self.node_tracker.router_node(msg.id)}}
{{File "/usr/lib/python2.7/qpid_dispatch_internal/router/node.py", line 363, in 
router_node}}
{{return self.nodes[node_id]}}
{{ KeyError: None}}
{{ 2018-05-29 14:31:05.146175 + ROUTER (error) Control message error: 
opcode=MAU body=None}}

I have tracked down the cause of the "Insufficient Data to Determine Tag" 
message to the following: during the call to {{qd_parse}} of the {{MAU}} 
message, the {{qd_iterator_t}} reaches the end of the buffer list before it 
should. Specifically, the call to {{qd_iterator_advance}} here 
[https://github.com/apache/qpid-dispatch/blob/master/src/parse.c#L151] "fails" 
to move forward a certain number of bytes (e.g. 31) even though the 
{{iterator->view_pointer->remaining}} value has plenty of bytes left (e.g. 
84802). The "fail" happens because we reach the end of the buffer list before 
we should (here 
[https://github.com/apache/qpid-dispatch/blob/master/src/iterator.c#L323]).

What I have not been able to figure out yet is why this happens, because it is 
not consistent. Many large MAU messages are parsed correctly; only sometimes 
one is not. I am able to reproduce these errors every time I run my tests. 
There may be a timing component involved, because the more logging I add to the 
router code, the less often the errors seem to occur.

  was:
Now that DISPATCH-966 has been fixed, I am still experiencing problems in a 
network with many clients / addresses. I am running a three-node fully 
connected mesh of dispatch routers with 1 attached clients, each with two 
addresses, messaging at around 100 msg/sec.

In the logs I am seeing the following errors:

{{2018-05-29 14:31:05.145732 + ERROR (error) Invalid message: Insufficient 
Data to Determine Tag}}
{{2018-05-29 14:31:05.145748 + ERROR (error) Invalid message: Can't convert 
message field body}}

Which, in turn, lead to python errors like:

{{2018-05-29 14:31:05.145971 + ROUTER_MA (trace) RCVD: MAU(id=None pv=1 
area=0 mobile_seq=0)}}
{{2018-05-29 14:31:05.146130 + ROUTER (error) Exception in control message 
processing}}
{{Traceback (most recent call last):}}
{{ File "/usr/lib/python2.7/qpid_dispatch_internal/router/engine.py", line 157, 
in handleControlMessage}}
{{ self.mobile_address_engine.handle_mau(msg, now)}}
{{ File "/usr/lib/python2.7/qpid_dispatch_internal/router/mobile.py", line 97, 
in handle_mau}}
{{ node = self.node_tracker.router_node(msg.id)}}
{{ File "/usr/lib/python2.7/qpid_dispatch_internal/router/node.py", line 363, 
in router_node}}
{{ return self.nodes[node_id]}}
{{KeyError: None}}
{{2018-05-29 14:31:05.146175 + ROUTER (error) Control message error: 
opcode=MAU body=None}}

I have tracked down the cause of the "Insufficient Data to Determine Tag" 
message to the following: during the call to {{qd_parse}} of the {{MAU}} 
message, the {{qd_iterator_t}} reaches the end of the buffer list before it 
should. Specifically, the call to {{qd_iterator_advance}} here 
[https://github.com/apache/qpid-dispatch/blob/master/src/parse.c#L151] "fails" 
to move forward a certain number of bytes (e.g. 31) even though the 
{{iterator->view_pointer->remaining}} value has plenty of bytes left (e.g. 
84802). The "fail" happens because we reach the end of the buffer list before 
we should (here 
[https://github.com/apache/qpid-dispatch/blob/master/src/iterator.c#L323]).

What I have not been able to figure out yet is why this happens, because it is 
not consistent. Many large MAU messages are parsed correctly; only sometimes 
one is not. 
I am able to reproduce these errors every 

[jira] [Created] (DISPATCH-1019) Messaging instability in networks with many clients / addresses.

2018-05-29 Thread Marcel Meulemans (JIRA)
Marcel Meulemans created DISPATCH-1019:
--

 Summary: Messaging instability in networks with many clients / 
addresses.
 Key: DISPATCH-1019
 URL: https://issues.apache.org/jira/browse/DISPATCH-1019
 Project: Qpid Dispatch
  Issue Type: Bug
  Components: Routing Engine
Affects Versions: 1.1.0
Reporter: Marcel Meulemans


Now that DISPATCH-966 has been fixed, I am still experiencing problems in a 
network with many clients / addresses. I am running a three-node fully 
connected mesh of dispatch routers with 1 attached clients, each with two 
addresses, messaging at around 100 msg/sec.

In the logs I am seeing the following errors:

{{2018-05-29 14:31:05.145732 + ERROR (error) Invalid message: Insufficient 
Data to Determine Tag}}
{{2018-05-29 14:31:05.145748 + ERROR (error) Invalid message: Can't convert 
message field body}}

Which, in turn, lead to python errors like:

{{2018-05-29 14:31:05.145971 + ROUTER_MA (trace) RCVD: MAU(id=None pv=1 
area=0 mobile_seq=0)}}
{{2018-05-29 14:31:05.146130 + ROUTER (error) Exception in control message 
processing}}
{{Traceback (most recent call last):}}
{{ File "/usr/lib/python2.7/qpid_dispatch_internal/router/engine.py", line 157, 
in handleControlMessage}}
{{ self.mobile_address_engine.handle_mau(msg, now)}}
{{ File "/usr/lib/python2.7/qpid_dispatch_internal/router/mobile.py", line 97, 
in handle_mau}}
{{ node = self.node_tracker.router_node(msg.id)}}
{{ File "/usr/lib/python2.7/qpid_dispatch_internal/router/node.py", line 363, 
in router_node}}
{{ return self.nodes[node_id]}}
{{KeyError: None}}
{{2018-05-29 14:31:05.146175 + ROUTER (error) Control message error: 
opcode=MAU body=None}}
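
The last frames of the traceback boil down to a dictionary lookup with a key of None, because the MAU was parsed with id=None. A minimal sketch of that failure and of a defensive variant follows; the class below is a hypothetical stand-in, not the code in qpid_dispatch_internal:

```python
class NodeTracker:
    """Hypothetical stand-in for the router's node table."""

    def __init__(self):
        self.nodes = {"router-1": "node-1"}

    def router_node(self, node_id):
        # Mirrors the lookup in the traceback: a missing key raises KeyError.
        return self.nodes[node_id]

    def router_node_or_none(self, node_id):
        # Defensive variant: unknown or missing ids yield None instead of raising.
        return self.nodes.get(node_id)


tracker = NodeTracker()
try:
    tracker.router_node(None)
except KeyError as exc:
    print("KeyError:", exc)
print(tracker.router_node_or_none(None))
```

A lookup that tolerates a missing id would only hide the symptom; the message body still failed to parse upstream.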

I have tracked down the cause of the "Insufficient Data to Determine Tag" 
message to the following: during the call to {{qd_parse}} of the {{MAU}} 
message, the {{qd_iterator_t}} reaches the end of the buffer list before it 
should. Specifically, the call to {{qd_iterator_advance}} here 
[https://github.com/apache/qpid-dispatch/blob/master/src/parse.c#L151] "fails" 
to move forward a certain number of bytes (e.g. 31) even though the 
{{iterator->view_pointer->remaining}} value has plenty of bytes left (e.g. 
84802). The "fail" happens because we reach the end of the buffer list before 
we should (here 
[https://github.com/apache/qpid-dispatch/blob/master/src/iterator.c#L323]).

What I have not been able to figure out yet is why this happens, because it is 
not consistent. Many large MAU messages are parsed correctly; only sometimes 
one is not. I am able to reproduce these errors every time I run my tests. 
There may be a timing component involved, because the more logging I add to the 
router code, the less often the errors seem to occur.
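
The symptom can be modelled as an advance over a chain of buffers whose total length disagrees with a separately tracked "remaining" count. This is a toy model only: the function is hypothetical, it is not the qd_iterator_advance implementation, and it does not explain why the buffer list ends early.

```python
def advance(buffers, remaining, count):
    """Skip `count` bytes across a list of byte buffers.

    `remaining` is the view's own idea of how many bytes are left; it is
    tracked separately from the buffers, so the two can disagree.
    Returns (bytes_skipped, remaining_after).
    """
    skipped = 0
    for buf in buffers:
        if skipped == count or remaining == 0:
            break
        # Never step past the current buffer, the request, or the view.
        step = min(len(buf), count - skipped, remaining)
        skipped += step
        remaining -= step
    return skipped, remaining


# The view claims 84802 bytes are left, but only 15 bytes of buffers exist.
print(advance([b"x" * 10, b"y" * 5], remaining=84802, count=31))
```

The advance stops after 15 of the requested 31 bytes even though the view still reports tens of thousands of bytes remaining, which matches the mismatch described above.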






[jira] [Commented] (DISPATCH-966) Qpid dispatch unstable inter-router connections

2018-05-17 Thread Marcel Meulemans (JIRA)

[ 
https://issues.apache.org/jira/browse/DISPATCH-966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16479173#comment-16479173
 ] 

Marcel Meulemans commented on DISPATCH-966:
---

[~ganeshmurthy], you are correct about PROTON-1514 fixing the initial symptoms. 
I just reran my test setup with a current master branch of proton and dispatch 
without the allowUnsettledMulticast:true setting and I no longer see the 
inter-router connections dropping.

> Qpid dispatch unstable inter-router connections
> ---
>
> Key: DISPATCH-966
> URL: https://issues.apache.org/jira/browse/DISPATCH-966
> Project: Qpid Dispatch
>  Issue Type: Bug
>  Components: Routing Engine
>Affects Versions: 1.0.1
>Reporter: Marcel Meulemans
>Assignee: Ted Ross
>Priority: Blocker
> Fix For: 1.1.0
>
> Attachments: inconsistent-settlement.log, 
> qdrouterd-unsettled-true.log, qdrouterd.conf, qdrouterd.log, 
> router-unsettled-true.dump, router.dump
>
>
> I am running a three node fully connected mesh of dispatch routers with 1 
> attached clients and I am seeing some unstable inter-router connections (I am 
> sending around 1000 small, less than 1K, messages per second through the 
> network). The inter-router connections fail every so many seconds with the 
> message:
> {{Connection to router-2:55672 failed: amqp:session:invalid-field sequencing 
> error, expected delivery-id 7, got 6}}
> (the numbers 7 and 6 differ per connection loss)
> In wireshark, using the attached tcpdump capture, I can see that every time 
> before the inter router connection is dropped, there is a rejected 
> disposition with the message:
> {{Condition: qd:forbidden}}
> {{Description: Deliveries to a multicast address must be pre-settled}}
> The routers are connected as follows:
>  * router-0 -> router-1
>  * router-0 -> router-2
>  * router-1 -> router-2
> The routers are running as a docker container (debian stretch) on google 
> compute engine machines (every router on a separate node).
> Attached are:
>  * my qdrouter.conf (from one of the routers)
>  * a log snippet from router-0 at debug level from connection drop to 
> connection re-established to connection drop again.
>  * a tcpdump capture of the inter-router connection between router-0 and 
> router-1 during which several of the failures occur
> Versions:
>  * qpid-dispatch@1.0.1-rc1
>  * qpid-proton@0.20.0
>  
> [^qdrouterd.log]
> [^qdrouterd.conf]
> [^router.dump]






[jira] [Commented] (DISPATCH-966) Qpid dispatch unstable inter-router connections

2018-05-17 Thread Marcel Meulemans (JIRA)

[ 
https://issues.apache.org/jira/browse/DISPATCH-966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16479167#comment-16479167
 ] 

Marcel Meulemans commented on DISPATCH-966:
---

[~tedross], that's great! Is the other problem you are referring to related to 
the further handling of such multi-transfer deliveries by the router (I think I 
see the large MAU message arriving, but not being processed after the receive 
is complete)? If so I can stop looking into this :P

> Qpid dispatch unstable inter-router connections
> ---
>
> Key: DISPATCH-966
> URL: https://issues.apache.org/jira/browse/DISPATCH-966
> Project: Qpid Dispatch
>  Issue Type: Bug
>  Components: Routing Engine
>Affects Versions: 1.0.1
>Reporter: Marcel Meulemans
>Assignee: Ted Ross
>Priority: Blocker
> Fix For: 1.1.0
>
> Attachments: inconsistent-settlement.log, 
> qdrouterd-unsettled-true.log, qdrouterd.conf, qdrouterd.log, 
> router-unsettled-true.dump, router.dump
>
>
> I am running a three node fully connected mesh of dispatch routers with 1 
> attached clients and I am seeing some unstable inter-router connections (I am 
> sending around 1000 small, less than 1K, messages per second through the 
> network). The inter-router connections fail every so many seconds with the 
> message:
> {{Connection to router-2:55672 failed: amqp:session:invalid-field sequencing 
> error, expected delivery-id 7, got 6}}
> (the numbers 7 and 6 differ per connection loss)
> In wireshark, using the attached tcpdump capture, I can see that every time 
> before the inter router connection is dropped, there is a rejected 
> disposition with the message:
> {{Condition: qd:forbidden}}
> {{Description: Deliveries to a multicast address must be pre-settled}}
> The routers are connected as follows:
>  * router-0 -> router-1
>  * router-0 -> router-2
>  * router-1 -> router-2
> The routers are running as a docker container (debian stretch) on google 
> compute engine machines (every router on a separate node).
> Attached are:
>  * my qdrouter.conf (from one of the routers)
>  * a log snippet from router-0 at debug level from connection drop to 
> connection re-established to connection drop again.
>  * a tcpdump capture of the inter-router connection between router-0 and 
> router-1 during which several of the failures occur
> Versions:
>  * qpid-dispatch@1.0.1-rc1
>  * qpid-proton@0.20.0
>  
> [^qdrouterd.log]
> [^qdrouterd.conf]
> [^router.dump]






[jira] [Commented] (DISPATCH-966) Qpid dispatch unstable inter-router connections

2018-05-17 Thread Marcel Meulemans (JIRA)

[ 
https://issues.apache.org/jira/browse/DISPATCH-966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16478735#comment-16478735
 ] 

Marcel Meulemans commented on DISPATCH-966:
---

Sorry for the slow response, but I finally got around to doing some follow-up 
on this. It turns out the python exceptions are a side effect, not the actual 
problem (however I'll try to reproduce the stack trace later, if only to 
improve the python response to the situation).

The actual problem is caused by this code (as far as I can see): 
[https://github.com/apache/qpid-dispatch/blob/master/src/message.c#L1168] ... 
In my situation with 1 clients (each with two unique addresses), the MAU 
messages exchanged between routers can become quite large, so large that the 
limit set on the number of msg->content->buffers 
(qd_message_Q2_holdoff_should_block) is hit. This holdoff is unblocked when 
buffers are freed up by sending them out, but as the MAU message is not being 
sent out the holdoff is never unblocked. As a consequence all communication on 
this link comes to a halt (some messages still arrive on the link until the 
credit is used up, but are never processed by the router code) and eventually 
the network breaks down. It seems to me that this blocking should not occur on 
messages that are not going to be sent out. I verified my theory by increasing 
QD_QLIMIT_Q2_UPPER and observing that the problem goes away, but that is of 
course not a correct solution. I don't know enough about the router internals 
to propose a solution, other than that the qd_message_Q2_holdoff_should_block 
implementation 
([https://github.com/apache/qpid-dispatch/blob/master/src/message.c#L1950]) 
should probably also take into account that not all messages are sent out to 
other destinations.
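
The stall described above can be illustrated with a toy model of the holdoff. Everything here is an assumption made for illustration: the threshold values, the lower resume threshold, and the class itself are hypothetical and do not reflect the router's actual implementation.

```python
Q2_UPPER = 4  # stands in for QD_QLIMIT_Q2_UPPER; the real value is not given here
Q2_LOWER = 2  # hypothetical resume threshold, assumed for this sketch


class Message:
    def __init__(self, forwarded):
        self.buffers = 0
        self.forwarded = forwarded  # False models a MAU consumed by the router itself
        self.blocked = False

    def receive_buffer(self):
        """Accept one incoming buffer unless the holdoff is active."""
        if self.blocked:
            return False
        self.buffers += 1
        if self.buffers >= Q2_UPPER:
            self.blocked = True
        return True

    def send_buffers(self, count):
        """Free buffers by sending them out; only possible for forwarded messages."""
        if not self.forwarded:
            return
        self.buffers = max(0, self.buffers - count)
        if self.buffers < Q2_LOWER:
            self.blocked = False


def run(message, rounds=3):
    """Alternate receiving and sending; return (buffers accepted, still blocked)."""
    accepted = 0
    for _ in range(rounds):
        for _ in range(Q2_UPPER):
            accepted += int(message.receive_buffer())
        message.send_buffers(Q2_UPPER)
    return accepted, message.blocked


print(run(Message(forwarded=True)))   # keeps flowing
print(run(Message(forwarded=False)))  # stalls once the upper limit is hit
```

In this model a forwarded message keeps draining and unblocking, while a message that is never sent out stays blocked after the first fill, which is the behaviour reported for large MAU messages.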

Btw, I have not been able to figure out how this leads to the initial error 
"Deliveries to a multicast address must be pre-settled". What I did notice is 
that proton trace logging is showing inconsistent settlement flags for messages 
that are split over multiple transfer frames (see 
[^inconsistent-settlement.log]).

> Qpid dispatch unstable inter-router connections
> ---
>
> Key: DISPATCH-966
> URL: https://issues.apache.org/jira/browse/DISPATCH-966
> Project: Qpid Dispatch
>  Issue Type: Bug
>  Components: Routing Engine
>Affects Versions: 1.0.1
>Reporter: Marcel Meulemans
>Assignee: Ted Ross
>Priority: Blocker
> Fix For: 1.1.0
>
> Attachments: inconsistent-settlement.log, 
> qdrouterd-unsettled-true.log, qdrouterd.conf, qdrouterd.log, 
> router-unsettled-true.dump, router.dump
>
>
> I am running a three node fully connected mesh of dispatch routers with 1 
> attached clients and I am seeing some unstable inter-router connections (I am 
> sending around 1000 small, less than 1K, messages per second through the 
> network). The inter-router connections fail every so many seconds with the 
> message:
> {{Connection to router-2:55672 failed: amqp:session:invalid-field sequencing 
> error, expected delivery-id 7, got 6}}
> (the numbers 7 and 6 differ per connection loss)
> In wireshark, using the attached tcpdump capture, I can see that every time 
> before the inter router connection is dropped, there is a rejected 
> disposition with the message:
> {{Condition: qd:forbidden}}
> {{Description: Deliveries to a multicast address must be pre-settled}}
> The routers are connected as follows:
>  * router-0 -> router-1
>  * router-0 -> router-2
>  * router-1 -> router-2
> The routers are running as a docker container (debian stretch) on google 
> compute engine machines (every router on a separate node).
> Attached are:
>  * my qdrouter.conf (from one of the routers)
>  * a log snippet from router-0 at debug level from connection drop to 
> connection re-established to connection drop again.
>  * a tcpdump capture of the inter-router connection between router-0 and 
> router-1 during which several of the failures occur
> Versions:
>  * qpid-dispatch@1.0.1-rc1
>  * qpid-proton@0.20.0
>  
> [^qdrouterd.log]
> [^qdrouterd.conf]
> [^router.dump]






[jira] [Updated] (DISPATCH-966) Qpid dispatch unstable inter-router connections

2018-05-17 Thread Marcel Meulemans (JIRA)

 [ 
https://issues.apache.org/jira/browse/DISPATCH-966?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marcel Meulemans updated DISPATCH-966:
--
Attachment: inconsistent-settlement.log

> Qpid dispatch unstable inter-router connections
> ---
>
> Key: DISPATCH-966
> URL: https://issues.apache.org/jira/browse/DISPATCH-966
> Project: Qpid Dispatch
>  Issue Type: Bug
>  Components: Routing Engine
>Affects Versions: 1.0.1
>Reporter: Marcel Meulemans
>Assignee: Ted Ross
>Priority: Blocker
> Fix For: 1.1.0
>
> Attachments: inconsistent-settlement.log, 
> qdrouterd-unsettled-true.log, qdrouterd.conf, qdrouterd.log, 
> router-unsettled-true.dump, router.dump
>
>
> I am running a three node fully connected mesh of dispatch routers with 1 
> attached clients and I am seeing some unstable inter-router connections (I am 
> sending around 1000 small, less than 1K, messages per second through the 
> network). The inter-router connections fail every so many seconds with the 
> message:
> {{Connection to router-2:55672 failed: amqp:session:invalid-field sequencing 
> error, expected delivery-id 7, got 6}}
> (the numbers 7 and 6 differ per connection loss)
> In wireshark, using the attached tcpdump capture, I can see that every time 
> before the inter router connection is dropped, there is a rejected 
> disposition with the message:
> {{Condition: qd:forbidden}}
> {{Description: Deliveries to a multicast address must be pre-settled}}
> The routers are connected as follows:
>  * router-0 -> router-1
>  * router-0 -> router-2
>  * router-1 -> router-2
> The routers are running as a docker container (debian stretch) on google 
> compute engine machines (every router on a separate node).
> Attached are:
>  * my qdrouter.conf (from one of the routers)
>  * a log snippet from router-0 at debug level from connection drop to 
> connection re-established to connection drop again.
>  * a tcpdump capture of the inter-router connection between router-0 and 
> router-1 during which several of the failures occur
> Versions:
>  * qpid-dispatch@1.0.1-rc1
>  * qpid-proton@0.20.0
>  
> [^qdrouterd.log]
> [^qdrouterd.conf]
> [^router.dump]






[jira] [Commented] (DISPATCH-974) Getting connections via the router management protocol causes AMQP framing errors

2018-04-24 Thread Marcel Meulemans (JIRA)

[ 
https://issues.apache.org/jira/browse/DISPATCH-974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16449877#comment-16449877
 ] 

Marcel Meulemans commented on DISPATCH-974:
---

Dug a little deeper and it seems that the problem is somehow related to 
response size. When I use qdmanage to get only the container names of every 
connection (qdmanage -t 30 query --type=org.apache.qpid.dispatch.connection 
container) I do get the full 5000 connections. I even tested up to 25000 
connections and that still works.

> Getting connections via the router management protocol causes AMQP framing 
> errors
> -
>
> Key: DISPATCH-974
> URL: https://issues.apache.org/jira/browse/DISPATCH-974
> Project: Qpid Dispatch
>  Issue Type: Bug
>  Components: Management Agent
>Affects Versions: 1.0.1
>Reporter: Marcel Meulemans
>Priority: Major
> Attachments: qdrouter-frame-errors.pcapng.gz
>
>
> I am running a standalone router with 5000 clients connected. When I try to 
> get all connections via qdstat (qdstat --limit 5000 -c) something goes wrong 
> (seems to be a framing error). The output from qdstat is:
> {{ MessageException: [-10]: data error: (null)}}
> The problem seems to be related to result size, because when I set the limit 
> lower I get the list of connections as expected. In my situation the critical 
> limit is 3447 (i.e. a limit of 3447 results in the expected list of 
> connections, 3448 results in the error above). It does not seem to be frame 
> size related, because the response for 3447 connections is already spread 
> over three transfer frames (256182, 256512 and 159399 bytes).
> The error is not qdstat related, because using some plain proton code to 
> create a management query results in the same problem. Ultimately the call to 
> pn_message_decode with the data received from the router fails (wireshark 
> cannot decode the final frame either).
> I have attached a wireshark dump of the qdstat session with the router 
> ([^qdrouter-frame-errors.pcapng.gz]). The logs of the router (at info level) 
> contain no further information.
>  






[jira] [Created] (DISPATCH-974) Getting connections via the router management protocol causes AMQP framing errors

2018-04-23 Thread Marcel Meulemans (JIRA)
Marcel Meulemans created DISPATCH-974:
-

 Summary: Getting connections via the router management protocol 
causes AMQP framing errors
 Key: DISPATCH-974
 URL: https://issues.apache.org/jira/browse/DISPATCH-974
 Project: Qpid Dispatch
  Issue Type: Bug
  Components: Management Agent
Affects Versions: 1.0.1
Reporter: Marcel Meulemans
 Attachments: qdrouter-frame-errors.pcapng.gz

I am running a standalone router with 5000 clients connected. When I try to get 
all connections via qdstat (qdstat --limit 5000 -c) something goes wrong (seems 
to be a framing error). The output from qdstat is:

{{ MessageException: [-10]: data error: (null)}}

The problem seems to be related to result size, because when I set the limit 
lower I get the list of connections as expected. In my situation the critical 
limit is 3447 (i.e. a limit of 3447 results in the expected list of 
connections, 3448 results in the error above). It does not seem to be frame 
size related, because the response for 3447 connections is already spread over 
three transfer frames (256182, 256512 and 159399 bytes).
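
A critical limit like this can be located with a binary search over the 
--limit value. The following is a minimal Python sketch, not something shipped 
with the router: the function names are hypothetical, and qdstat_ok assumes 
that qdstat exits with a non-zero status when it hits the MessageException 
(check this on your installation first).

```python
import subprocess
from typing import Callable


def qdstat_ok(limit: int) -> bool:
    """Return True if `qdstat --limit <limit> -c` exits successfully.

    Assumption: a failed query makes qdstat exit with a non-zero status.
    """
    result = subprocess.run(
        ["qdstat", "--limit", str(limit), "-c"],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def largest_working_limit(lo: int, hi: int, works: Callable[[int], bool]) -> int:
    """Binary search for the largest limit for which works() returns True.

    Assumes works(lo) is True, works(hi) is False, and that there is a
    single threshold somewhere in between.
    """
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if works(mid):
            lo = mid  # mid still works, so the threshold is at or above mid
        else:
            hi = mid  # mid fails, so the threshold is below mid
    return lo
```

With 5000 connected clients, largest_working_limit(1, 5000, qdstat_ok) needs 
about a dozen qdstat runs to narrow the threshold down to a single value.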

The error is not qdstat related, because using some plain proton code to create 
a management query results in the same problem. Ultimately the call to 
pn_message_decode with the data received from the router fails (wireshark 
cannot decode the final frame either).

I have attached a wireshark dump of the qdstat session with the router 
([^qdrouter-frame-errors.pcapng.gz]). The logs of the router (at info level) 
contain no further information.

 






[jira] [Comment Edited] (DISPATCH-966) Qpid dispatch unstable inter-router connections

2018-04-13 Thread Marcel Meulemans (JIRA)

[ 
https://issues.apache.org/jira/browse/DISPATCH-966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16437039#comment-16437039
 ] 

Marcel Meulemans edited comment on DISPATCH-966 at 4/13/18 11:18 AM:
-

Added to the configuration and ran the test again several times. However, now I 
see some things I did not expect: the network seems to come up correctly, but 
after a while it seems to fail in a weird way. The inter-router connections do 
not seem to drop anymore, but routing via the network does not seem to work 
(i.e. ROUTER_LS (info) Computed next hops: {} and qdstat -n shows only a single 
router). Maybe this is the issue unmasked by allowing unsettled multicasts?

I attached two more files:
 * the logs of router-0 (from router start until slightly after the network 
fails) at info level
 * a tcpdump of the inter-router communication to and from router-0 (tcpdump -i 
eth0 tcp port 55672 -s 65535), also from router start until slightly after the 
network fails

I hope this helps (the dump is fairly large, so I hope you can find any hidden 
needles).

-- 
 Marcel


was (Author: mmeulemans):
Added to the configuration and ran the test again several times. However now I 
see some things I did not expect; the network seems to come up correctly, but 
after a while it seems to fail in a weird way. The inter-router connections do 
not seems to drop anymore but routing via the network does not seem to work 
(i.e.ROUTER_LS (info) Computed next hops: {} and qdstat -n show only a single 
router). Maybe this is the issue unmasked by allowing unsettled multicasts?

I attached two more file:
 * the logs of router-0 (from router start until slightly after the network 
fails) at info level
 * a tcpdump to the inter router communication to an from router-0 (tcpdump -i 
eth0 tcp port 55672 -s 65535)

I hope this helps (the dump is fairly large, so I hope you can find any hidden 
needles).

-- 
Marcel

> Qpid dispatch unstable inter-router connections
> ---
>
> Key: DISPATCH-966
> URL: https://issues.apache.org/jira/browse/DISPATCH-966
> Project: Qpid Dispatch
>  Issue Type: Bug
>  Components: Routing Engine
>Affects Versions: 1.0.1
>Reporter: Marcel Meulemans
>Assignee: Ted Ross
>Priority: Major
> Attachments: qdrouterd-unsettled-true.log, qdrouterd.conf, 
> qdrouterd.log, router-unsettled-true.dump, router.dump
>
>
> I am running a three node fully connected mesh of dispatch routers with 1 
> attached clients and I am seeing some unstable inter-router connections (I am 
> sending around 1000 small, less than 1K, messages per second through the 
> network). The inter-router connections fail every so many seconds with the 
> message:
> {{Connection to router-2:55672 failed: amqp:session:invalid-field sequencing 
> error, expected delivery-id 7, got 6}}
> (the numbers 7 and 6 differ per connection loss)
> In wireshark, using the attached tcpdump capture, I can see that every time 
> before the inter-router connection is dropped, there is a rejected 
> disposition with the message:
> {{Condition: qd:forbidden}}
> {{Description: Deliveries to a multicast address must be pre-settled}}
> The routers are connected as follows:
>  * router-0 -> router-1
>  * router-0 -> router-2
>  * router-1 -> router-2
> The routers are running as a docker container (debian stretch) on google 
> compute engine machines (every router on a separate node).
> Attached are:
>  * my qdrouterd.conf (from one of the routers)
>  * a log snippet from router-0 at debug level from connection drop to 
> connection re-established to connection drop again.
>  * a tcpdump capture of the inter-router connection between router-0 and 
> router-1 during which several of the failures occur
> Versions:
>  * qpid-dispatch@1.0.1-rc1
>  * qpid-proton@0.20.0
>  
> [^qdrouterd.log]
> [^qdrouterd.conf]
> [^router.dump]






[jira] [Updated] (DISPATCH-966) Qpid dispatch unstable inter-router connections

2018-04-13 Thread Marcel Meulemans (JIRA)

 [ 
https://issues.apache.org/jira/browse/DISPATCH-966?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marcel Meulemans updated DISPATCH-966:
--
Attachment: qdrouterd-unsettled-true.log
router-unsettled-true.dump

> Qpid dispatch unstable inter-router connections
> ---
>
> Key: DISPATCH-966
> URL: https://issues.apache.org/jira/browse/DISPATCH-966
> Project: Qpid Dispatch
>  Issue Type: Bug
>  Components: Routing Engine
>Affects Versions: 1.0.1
>Reporter: Marcel Meulemans
>Assignee: Ted Ross
>Priority: Major
> Attachments: qdrouterd-unsettled-true.log, qdrouterd.conf, 
> qdrouterd.log, router-unsettled-true.dump, router.dump
>
>
> I am running a three node fully connected mesh of dispatch routers with 1 
> attached clients and I am seeing some unstable inter-router connections (I am 
> sending around 1000 small, less than 1K, messages per second through the 
> network). The inter-router connections fail every so many seconds with the 
> message:
> {{Connection to router-2:55672 failed: amqp:session:invalid-field sequencing 
> error, expected delivery-id 7, got 6}}
> (the numbers 7 and 6 differ per connection loss)
> In wireshark, using the attached tcpdump capture, I can see that every time 
> before the inter-router connection is dropped, there is a rejected 
> disposition with the message:
> {{Condition: qd:forbidden}}
> {{Description: Deliveries to a multicast address must be pre-settled}}
> The routers are connected as follows:
>  * router-0 -> router-1
>  * router-0 -> router-2
>  * router-1 -> router-2
> The routers are running as a docker container (debian stretch) on google 
> compute engine machines (every router on a separate node).
> Attached are:
>  * my qdrouterd.conf (from one of the routers)
>  * a log snippet from router-0 at debug level from connection drop to 
> connection re-established to connection drop again.
>  * a tcpdump capture of the inter-router connection between router-0 and 
> router-1 during which several of the failures occur
> Versions:
>  * qpid-dispatch@1.0.1-rc1
>  * qpid-proton@0.20.0
>  
> [^qdrouterd.log]
> [^qdrouterd.conf]
> [^router.dump]






[jira] [Commented] (DISPATCH-966) Qpid dispatch unstable inter-router connections

2018-04-13 Thread Marcel Meulemans (JIRA)

[ 
https://issues.apache.org/jira/browse/DISPATCH-966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16437039#comment-16437039
 ] 

Marcel Meulemans commented on DISPATCH-966:
---

Added to the configuration and ran the test again several times. However, now I 
see some things I did not expect: the network seems to come up correctly, but 
after a while it seems to fail in a weird way. The inter-router connections do 
not seem to drop anymore, but routing via the network does not seem to work 
(i.e. ROUTER_LS (info) Computed next hops: {} and qdstat -n shows only a single 
router). Maybe this is the issue unmasked by allowing unsettled multicasts?

I attached two more files:
 * the logs of router-0 (from router start until slightly after the network 
fails) at info level
 * a tcpdump of the inter-router communication to and from router-0 (tcpdump -i 
eth0 tcp port 55672 -s 65535)

I hope this helps (the dump is fairly large, so I hope you can find any hidden 
needles).

-- 
Marcel

> Qpid dispatch unstable inter-router connections
> ---
>
> Key: DISPATCH-966
> URL: https://issues.apache.org/jira/browse/DISPATCH-966
> Project: Qpid Dispatch
>  Issue Type: Bug
>  Components: Routing Engine
>Affects Versions: 1.0.1
>Reporter: Marcel Meulemans
>Assignee: Ted Ross
>Priority: Major
> Attachments: qdrouterd-unsettled-true.log, qdrouterd.conf, 
> qdrouterd.log, router-unsettled-true.dump, router.dump
>
>
> I am running a three node fully connected mesh of dispatch routers with 1 
> attached clients and I am seeing some unstable inter-router connections (I am 
> sending around 1000 small, less than 1K, messages per second through the 
> network). The inter-router connections fail every so many seconds with the 
> message:
> {{Connection to router-2:55672 failed: amqp:session:invalid-field sequencing 
> error, expected delivery-id 7, got 6}}
> (the numbers 7 and 6 differ per connection loss)
> In wireshark, using the attached tcpdump capture, I can see that every time 
> before the inter-router connection is dropped, there is a rejected 
> disposition with the message:
> {{Condition: qd:forbidden}}
> {{Description: Deliveries to a multicast address must be pre-settled}}
> The routers are connected as follows:
>  * router-0 -> router-1
>  * router-0 -> router-2
>  * router-1 -> router-2
> The routers are running as a docker container (debian stretch) on google 
> compute engine machines (every router on a separate node).
> Attached are:
>  * my qdrouterd.conf (from one of the routers)
>  * a log snippet from router-0 at debug level from connection drop to 
> connection re-established to connection drop again.
>  * a tcpdump capture of the inter-router connection between router-0 and 
> router-1 during which several of the failures occur
> Versions:
>  * qpid-dispatch@1.0.1-rc1
>  * qpid-proton@0.20.0
>  
> [^qdrouterd.log]
> [^qdrouterd.conf]
> [^router.dump]






[jira] [Created] (DISPATCH-966) Qpid dispatch unstable inter-router connections

2018-04-12 Thread Marcel Meulemans (JIRA)
Marcel Meulemans created DISPATCH-966:
-

 Summary: Qpid dispatch unstable inter-router connections
 Key: DISPATCH-966
 URL: https://issues.apache.org/jira/browse/DISPATCH-966
 Project: Qpid Dispatch
  Issue Type: Bug
  Components: Routing Engine
Affects Versions: 1.0.1
Reporter: Marcel Meulemans
 Attachments: qdrouterd.conf, qdrouterd.log, router.dump

I am running a three node fully connected mesh of dispatch routers with 1 
attached clients and I am seeing some unstable inter-router connections (I am 
sending around 1000 small, less than 1K, messages per second through the 
network). The inter-router connections fail every so many seconds with the 
message:

{{Connection to router-2:55672 failed: amqp:session:invalid-field sequencing 
error, expected delivery-id 7, got 6}}

(the numbers 7 and 6 differ per connection loss)
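
To illustrate what the error message describes, here is a toy Python model of 
a receiver that expects delivery-ids to arrive in sequence within a session. 
It is a sketch for illustration only, not the actual proton or dispatch code: 
the class and method names are made up, and wrap-around of the id is ignored.

```python
class SequencingError(Exception):
    """Raised when a delivery-id arrives out of sequence."""


class SessionReceiver:
    """Toy model: tracks the next delivery-id expected on one session."""

    def __init__(self, next_id: int = 0) -> None:
        self.next_id = next_id

    def on_new_delivery(self, delivery_id: int) -> None:
        # Every new delivery must carry exactly the next expected id.
        if delivery_id != self.next_id:
            raise SequencingError(
                f"sequencing error, expected delivery-id {self.next_id}, "
                f"got {delivery_id}"
            )
        self.next_id += 1


receiver = SessionReceiver()
for delivery_id in range(7):  # deliveries 0..6 are accepted in order
    receiver.on_new_delivery(delivery_id)
# A second delivery numbered 6 would now raise:
#   sequencing error, expected delivery-id 7, got 6
```

In this model the error appears when a sender reuses an id it has already 
used on the session, which matches the "expected 7, got 6" pattern above.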

In wireshark, using the attached tcpdump capture, I can see that every time 
before the inter-router connection is dropped, there is a rejected disposition 
with the message:

{{Condition: qd:forbidden}}
{{Description: Deliveries to a multicast address must be pre-settled}}

The routers are connected as follows:
 * router-0 -> router-1
 * router-0 -> router-2
 * router-1 -> router-2
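
As a quick sanity check that these three connections form a fully connected 
mesh, the list can be verified with a few lines of Python. This is only a 
sketch; is_full_mesh is a hypothetical helper, and the direction of each 
connection is ignored because an inter-router connection links both routers.

```python
from itertools import combinations

# The connections listed above, as (from, to) pairs.
connectors = [
    ("router-0", "router-1"),
    ("router-0", "router-2"),
    ("router-1", "router-2"),
]


def is_full_mesh(connectors) -> bool:
    """Return True if every pair of routers is linked by some connection."""
    routers = {router for pair in connectors for router in pair}
    linked = {frozenset(pair) for pair in connectors}
    return all(
        frozenset(pair) in linked
        for pair in combinations(sorted(routers), 2)
    )
```

For the list above is_full_mesh(connectors) is True; dropping any one of the 
three connections makes it False.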

The routers are running as a docker container (debian stretch) on google 
compute engine machines (every router on a separate node).

Attached are:
 * my qdrouterd.conf (from one of the routers)
 * a log snippet from router-0 at debug level from connection drop to 
connection re-established to connection drop again.
 * a tcpdump capture of the inter-router connection between router-0 and 
router-1 during which several of the failures occur

Versions:
 * qpid-dispatch@1.0.1-rc1
 * qpid-proton@0.20.0

 

[^qdrouterd.log]

[^qdrouterd.conf]

[^router.dump]


