[jira] Created: (QPID-2995) JMS Client does not set "redelivered" flag on retransmitted messages

2011-01-10 Thread Rajith Attapattu (JIRA)
JMS Client does not set "redelivered" flag on retransmitted messages


 Key: QPID-2995
 URL: https://issues.apache.org/jira/browse/QPID-2995
 Project: Qpid
  Issue Type: Bug
  Components: Java Client
Affects Versions: 0.8, 0.7, 0.6
Reporter: Rajith Attapattu
Assignee: Rajith Attapattu
Priority: Minor
 Fix For: Future


When a JMS client fails over, it will replay in-doubt messages. These replayed 
messages should have the redelivered flag set on them.
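The consequence for applications can be illustrated with a broker-free sketch. Everything below is hypothetical stand-in code, not Qpid or JMS API (real code would call javax.jms.Message.getJMSRedelivered()): because the flag is not set on replayed messages, an application cannot trust it and must deduplicate by message ID itself to avoid double-processing after failover.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical stand-in for a received message; a real JMS consumer would
// use javax.jms.Message and its getJMSRedelivered()/getJMSMessageID().
class InboundMessage {
    final String messageId;
    final boolean redelivered; // should be true on replayed messages (the bug: it isn't)
    InboundMessage(String messageId, boolean redelivered) {
        this.messageId = messageId;
        this.redelivered = redelivered;
    }
}

// Consumer-side duplicate detection: since the redelivered flag is
// unreliable here, track seen message IDs to skip post-failover replays.
class DedupConsumer {
    private final Set<String> seen = new HashSet<>();
    private int processed = 0;

    void onMessage(InboundMessage m) {
        if (!seen.add(m.messageId)) {
            return; // already processed; this is a replay after failover
        }
        processed++; // real processing would go here
    }

    int processedCount() { return processed; }
}

public class RedeliveryDemo {
    public static void main(String[] args) {
        DedupConsumer c = new DedupConsumer();
        c.onMessage(new InboundMessage("ID:1", false));
        c.onMessage(new InboundMessage("ID:2", false));
        // Failover: message 2 is replayed, but redelivered is (wrongly) false.
        c.onMessage(new InboundMessage("ID:2", false));
        System.out.println(c.processedCount()); // prints 2, not 3
    }
}
```

This is only a workaround sketch; once the client sets the flag correctly, applications can use getJMSRedelivered() as the first-pass duplicate hint the JMS spec intends.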

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
Apache Qpid - AMQP Messaging Implementation
Project:  http://qpid.apache.org
Use/Interact: mailto:dev-subscr...@qpid.apache.org



[jira] Created: (QPID-2994) transactions atomicity violated by 'transparent' failover

2011-01-10 Thread Rajith Attapattu (JIRA)
transactions atomicity violated by 'transparent' failover
-

 Key: QPID-2994
 URL: https://issues.apache.org/jira/browse/QPID-2994
 Project: Qpid
  Issue Type: Bug
  Components: Java Client
Affects Versions: 0.8, 0.7, 0.6
Reporter: Rajith Attapattu
Assignee: Rajith Attapattu
 Fix For: Future


The messages published within a batch at the point the connection fails over 
appear to be replayed outside of any transaction.

Steps to Reproduce:
1. Start a transactional session on a failover-enabled connection
2. Send batches of messages in transactions
3. Kill the cluster node the client is connected to, triggering failover 
mid-transaction

This happens because the lower layer replays unacked messages upon resuming 
the connection.
Message replay should not happen on a transacted session, as there is no 
benefit in doing so.
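The atomicity violation can be modelled without a broker. This is a toy sketch under stated assumptions (ToySession and its methods are illustrative inventions, not the Qpid client API): a transacted session buffers sends until commit, so a failover layer that replays in-doubt sends directly makes part of an uncommitted batch visible, breaking all-or-nothing semantics.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of a transacted session: sends are buffered locally and only
// become visible on commit(). Names are illustrative, not Qpid API.
class ToySession {
    private final List<String> uncommitted = new ArrayList<>();
    final List<String> brokerQueue = new ArrayList<>(); // messages visible to consumers

    void send(String msg) { uncommitted.add(msg); }

    void commit() {
        brokerQueue.addAll(uncommitted); // the whole batch appears at once
        uncommitted.clear();
    }

    // The reported bug, modelled: on failover the lower layer replays
    // in-doubt sends directly, bypassing the transaction buffer.
    void buggyFailoverReplay() {
        brokerQueue.addAll(uncommitted); // partial, uncommitted batch leaks out
    }
}

public class TxAtomicityDemo {
    public static void main(String[] args) {
        ToySession s = new ToySession();
        s.send("m1");
        s.send("m2");
        s.buggyFailoverReplay(); // failover mid-transaction, before commit()
        // The batch was never committed, yet both messages are visible:
        // transaction atomicity is violated.
        System.out.println(s.brokerQueue.size()); // prints 2
    }
}
```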




[jira] Commented: (QPID-2992) Cluster failing to resurrect durable static route depending on order of shutdown

2011-01-10 Thread Mark Moseley (JIRA)

[ 
https://issues.apache.org/jira/browse/QPID-2992?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12979891#action_12979891
 ] 

Mark Moseley commented on QPID-2992:


I also rewrote the script to do a B1->B2->B2->B1 shutdown/startup sequence 
first (the binding was visible after that), then do a B2->B1->B1->B2 stop/start, 
and the binding wasn't there. Maybe it gets a single freebie in a super-clean 
cluster?

I had originally posted to the list since I figured I was probably doing 
something wrong, so there could be some conceptual problem on my part, i.e. 
maybe it's not supposed to work the way I'm expecting.

> Cluster failing to resurrect durable static route depending on order of 
> shutdown
> 
>
> Key: QPID-2992
> URL: https://issues.apache.org/jira/browse/QPID-2992
> Project: Qpid
>  Issue Type: Bug
>  Components: C++ Broker, C++ Clustering
>Affects Versions: 0.8
> Environment: Debian Linux Squeeze, 32-bit, kernel 2.6.36.2, Dell 
> Poweredge 1950s. Corosync==1.3.0, Openais==1.1.4
>Reporter: Mark Moseley
>Assignee: Alan Conway
> Attachments: cluster-fed.sh, error
>
>
> I've got a 2-node qpid test cluster at each of 2 datacenters, which are 
> federated together with a single durable static route between each. Qpid is 
> version 0.8. Corosync and openais are stock Squeeze (1.2.1-3 and 1.1.2-2, 
> respectively). OS is Squeeze, 32-bit, on Dell Poweredge 1950s, kernel 2.6.36. 
> The static route is durable and is set up over SSL (but I can replicate as 
> well with non-SSL). I've tried to normalize the hostnames below to make 
> things clearer; hopefully I didn't mess anything up.
> Given two clusters, cluster A (consisting of hosts A1 and A2) and cluster B 
> (with B1 and B2), I've got a static exchange route from A1 to B1, as well as 
> another from B1 to A1. Federation is working correctly, so I can send a 
> message on A2 and have it successfully retrieved on B2. The exchange local to 
> cluster A is walmyex1; the local exchange for B is bosmyex1.
> If I shut down the cluster in this order: B2, then B1, and start back up with 
> B1, B2, the static route fails to get recreated. That is, on A1/A2, 
> looking at the bindings, exchange 'bosmyex1' does not get re-bound to cluster 
> B; the only output for it in "qpid-config exchanges --bindings" is just:
> 
> Exchange 'bosmyex1' (direct)
> 
> If however I shut the cluster down in this order: B1, then B2, and start B2, 
> then B1, the static route gets re-bound. The output then is:
> 
> Exchange 'bosmyex1' (direct)
> bind [unix.boston.cust] => 
> bridge_queue_1_8870523d-2286-408e-b5b5-50d53db2fa61
> 
> and I can message over the federated link with no further modification. Prior 
> to a few minutes ago, I was seeing this with the Squeeze stock openais==1.1.2 
> and corosync==1.2.1. In debugging this, I've upgraded both to the latest 
> versions with no change.
> I can replicate this every time I try. These are just test clusters, so I 
> don't have any other activity going on on them, or any other 
> exchanges/queues. My steps:
> On all boxes in cluster A and B:
> * Kill the qpidd if it's running and delete all existing store files, i.e. 
> contents of /var/lib/qpid/
> On host A1 in cluster A (I'm leaving out the -a user/t...@host stuff):
> * Start up qpid
> * qpid-config add exchange direct bosmyex1 --durable
> * qpid-config add exchange direct walmyex1 --durable
> * qpid-config add queue walmyq1 --durable
> * qpid-config bind walmyex1 walmyq1 unix.waltham.cust
> On host B1 in cluster B:
> * qpid-config add exchange direct bosmyex1 --durable
> * qpid-config add exchange direct walmyex1 --durable
> * qpid-config add queue bosmyq1 --durable
> * qpid-config bind bosmyex1 bosmyq1 unix.boston.cust
> On cluster A:
> * Start other member of cluster, A2
> * qpid-route route add amqps://user/p...@hosta1:5671 
> amqps://user/p...@hostb1:5671 walmyex1 unix.waltham.cust -d
> On cluster B:
> * Start other member of cluster, B2
> * qpid-route route add amqps://user/p...@hostb1:5671 
> amqps://user/p...@hosta1:5671 bosmyex1 unix.boston.cust -d
> On either cluster:
> * Check "qpid-config exchanges --bindings" to make sure bindings are correct 
> for remote exchanges
> * To see correct behaviour, stop cluster in the order B1->B2, or A1->A2, 
> start cluster back up, check bindings.
> * To see broken behaviour, stop cluster in the order B2->B1, or A2->A1, start 
> cluster back up, check bindings.
> This is a test cluster, so I'm free to do anything with it, debugging-wise, 
> that would be useful. 


[jira] Updated: (QPID-2992) Cluster failing to resurrect durable static route depending on order of shutdown

2011-01-10 Thread Mark Moseley (JIRA)

 [ 
https://issues.apache.org/jira/browse/QPID-2992?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Moseley updated QPID-2992:
---

Attachment: error

This is the output from the script when it does another round of stop/starts 
with the order flipped the second time around.





[jira] Commented: (QPID-2992) Cluster failing to resurrect durable static route depending on order of shutdown

2011-01-10 Thread Mark Moseley (JIRA)

[ 
https://issues.apache.org/jira/browse/QPID-2992?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12979883#action_12979883
 ] 

Mark Moseley commented on QPID-2992:


I tried reproducing with this script on one of the nodes in question, and it 
seemed to work perfectly. I added authentication as well, and it continued to 
work OK. Your test script is pretty much exactly what I'm doing, too.

I wonder, though (I'm just trying to think of reasons why it'd act 
differently in the two scenarios): can you try this out on 4 separate nodes, 
even if virtualized? When I reproduce this on the physical nodes, with 
debug logging turned on, it doesn't mention the node on the other side of the 
federated link, whereas when it does work, I see this in the logs:

2011-01-10 19:35:12 debug Known hosts for peer of inter-broker link: 
amqp:tcp:10.1.58.3:5672 amqp:tcp:10.1.58.4:5672 

Running through this again today, I noticed that sometimes, with a completely 
fresh cluster, the connection in a B2->B1->B1->B2 shutdown/startup does work. 
But then I do it again and it doesn't. Or if I do the opposite order it breaks 
as well.

I just modified your script so that after the first round of 
stop/start/check-binding, it flips the order and shuts them down again and 
starts them up -- and yes, I realize this is the opposite order from my ticket 
:) -- and re-checks bindings and they're gone. I'm attaching the output of your 
script.

(Just for clarification, 10.1.58.3==exp01==A1, 10.1.58.4==exp02==A2, 
10.20.58.1==bosmsg01==B1, and 10.20.58.2==bosmsg02==B2. I've been trying to 
regex the hostnames so you guys didn't have to deal with following my 
hostnames, but if you guys prefer, I don't mind just using the real names.)



Re: svn commit: r1056420 - in /qpid/trunk/qpid/cpp/bindings: qmf/python/Makefile.am qmf/ruby/Makefile.am qmf2/python/Makefile.am qpid/python/Makefile.am qpid/ruby/Makefile.am

2011-01-10 Thread Alan Conway

On 01/10/2011 01:43 PM, Ted Ross wrote:

On 01/10/2011 01:04 PM, Gordon Sim wrote:

On 01/10/2011 02:44 PM, Ted Ross wrote:

Devs,

This commit causes the automake build to install Python files (as part
of wrapped interfaces) during make-install. You may see a build error
like the following:

bindings/qpid/python/Makefile.am:39: required file
`build-aux/py-compile' not found
bindings/qpid/python/Makefile.am:39: `automake --add-missing' can
install `py-compile'

To fix this, type 'automake --add-missing' from the qpid/cpp directory.
This will cause automake to install the missing py-compile module in
build-aux.


Is that something we should add to the bootstrap script?


I would think so. I was surprised that it wasn't already in there.



Done, in r1057342




[jira] Updated: (QPID-2993) Federated source-local links crash remotely federated cluster member on local cluster startup

2011-01-10 Thread Ken Giusti (JIRA)

 [ 
https://issues.apache.org/jira/browse/QPID-2993?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ken Giusti updated QPID-2993:
-

Attachment: cluster-fed-src.sh

This appears to reproduce the described crash using the current trunk.

> Federated source-local links crash remotely federated cluster member on local 
> cluster startup
> -
>
> Key: QPID-2993
> URL: https://issues.apache.org/jira/browse/QPID-2993
> Project: Qpid
>  Issue Type: Bug
>  Components: C++ Broker, C++ Clustering
>Affects Versions: 0.8
> Environment: Debian Linux Squeeze, 32-bit, kernel 2.6.36.2, Dell 
> Poweredge 1950s. Corosync==1.3.0, Openais==1.1.4
>Reporter: Mark Moseley
>Assignee: Alan Conway
> Attachments: cluster-fed-src.sh
>
>
> This is related to JIRA 2992 that I opened, but this is for source-local 
> routes. Given the same setup as in JIRA 2992 but using source-local routes 
> (and obviously with the exchanges switched accordingly in the qpid-route 
> statements), i.e. cluster A and cluster B with the routes between A1<->B1, 
> when cluster B shuts down in the order B2->B1 and starts back up, the static 
> routes are not correctly re-bound on cluster A's side. However if cluster B 
> is shut down in the order B1->B2 and started back up, the route is correctly 
> created and works. In the non-functioning case (B2->B1, or A2->A1), there is 
> an additional side-effect: on node A2, qpidd crashes with the 
> following error (cluster A is called 'walclust', B is bosclust):
> 2011-01-07 18:57:35 error Channel exception: not-attached: Channel 1 is not 
> attached (qpid/amqp_0_10/SessionHandler.cpp:39)
> 2011-01-07 18:57:35 critical cluster(102.0.0.0:13650 READY/error) local error 
> 2030 did not occur on member 101.0.0.0:9920: not-attached: Channel 1 is not 
> attached (qpid/amqp_0_10/SessionHandler.cpp:39)
> 2011-01-07 18:57:35 critical Error delivering frames: local error did not 
> occur on all cluster members : not-attached: Channel 1 is not attached 
> (qpid/amqp_0_10/SessionHandler.cpp:39) (qpid/cluster/ErrorCheck.cpp:89)
> 2011-01-07 18:57:35 notice cluster(102.0.0.0:13650 LEFT/error) leaving 
> cluster walclust
> 2011-01-07 18:57:35 notice Shut down
> This happens on both sides of the cluster, so it's not limited to one or the 
> other. This crash does *not* occur in the A1->A2/B1->B2 test (i.e. the test 
> where the route is re-bound correctly). I can cause this to reoccur pretty 
> much every time. I've been resetting the cluster completely to a new state 
> between each test. Occasionally in the B2->B1 test, A1 will also crash with 
> the same error (and vice versa for A2->A1 for node B1), though most of the 
> time, it's A2/B2 that crashes.
> I was getting this same behaviour prior to upgrading corosync/openais as 
> well. Previously I was using the stock Squeeze versions of corosync==1.2.1 
> and openais==1.1.2. The results are the same with corosync=1.3.0 and 
> openais==1.1.4.




[jira] Updated: (QPID-2992) Cluster failing to resurrect durable static route depending on order of shutdown

2011-01-10 Thread Ken Giusti (JIRA)

 [ 
https://issues.apache.org/jira/browse/QPID-2992?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ken Giusti updated QPID-2992:
-

Attachment: cluster-fed.sh

Hi Mark,  

I'm trying to reproduce this problem on my Fedora 14 box, without luck. Can 
you try the attached script (you'll have to modify it a bit to find your 
cluster and message store libraries) and let me know if it causes the problem 
for you?

thanks,

-K





Re: svn commit: r1056420 - in /qpid/trunk/qpid/cpp/bindings: qmf/python/Makefile.am qmf/ruby/Makefile.am qmf2/python/Makefile.am qpid/python/Makefile.am qpid/ruby/Makefile.am

2011-01-10 Thread Ted Ross

On 01/10/2011 01:04 PM, Gordon Sim wrote:

On 01/10/2011 02:44 PM, Ted Ross wrote:

Devs,

This commit causes the automake build to install Python files (as part
of wrapped interfaces) during make-install. You may see a build error
like the following:

bindings/qpid/python/Makefile.am:39: required file
`build-aux/py-compile' not found
bindings/qpid/python/Makefile.am:39: `automake --add-missing' can
install `py-compile'

To fix this, type 'automake --add-missing' from the qpid/cpp directory.
This will cause automake to install the missing py-compile module in
build-aux.


Is that something we should add to the bootstrap script?


I would think so.  I was surprised that it wasn't already in there.





Re: svn commit: r1056420 - in /qpid/trunk/qpid/cpp/bindings: qmf/python/Makefile.am qmf/ruby/Makefile.am qmf2/python/Makefile.am qpid/python/Makefile.am qpid/ruby/Makefile.am

2011-01-10 Thread Gordon Sim

On 01/10/2011 02:44 PM, Ted Ross wrote:

Devs,

This commit causes the automake build to install Python files (as part
of wrapped interfaces) during make-install. You may see a build error
like the following:

bindings/qpid/python/Makefile.am:39: required file
`build-aux/py-compile' not found
bindings/qpid/python/Makefile.am:39: `automake --add-missing' can
install `py-compile'

To fix this, type 'automake --add-missing' from the qpid/cpp directory.
This will cause automake to install the missing py-compile module in
build-aux.


Is that something we should add to the bootstrap script?




Re: Qpid Improvement Process

2011-01-10 Thread jross

On Mon, 10 Jan 2011, Robert Godfrey wrote:

Hi, Rob.


Hi Justin,

On 6 January 2011 15:42, Justin Ross  wrote:


Hi, everyone.

I'd like to propose a new way to communicate about the major changes going
into our releases.  The goal is to help us make decisions early in the
process, coordinate when a change in one place impacts another, and engender
a broader sense of direction.  It should also help to document the work that
we do.



So, I have a few questions / comments about this.

Firstly I think we need to be clear whether QIPs are describing an end state
(an implied specification), or the act of making the change to the codebase
to achieve the end result (which is what a JIRA is for, surely)?


QIPs are mostly end-state oriented, but they may include notes about 
implementation insofar as they help us understand a feature's impact on a 
release.



Secondly, I think we need to have a better definition of when a change
should be a QIP or not: what constitutes "major"? does a purely internal
refactoring count? what about removing an unloved component?


I'm not entirely decided on what the criteria should be, and I'm not yet 
convinced they need to be strict.  We can always look at an incoming 
change and say, "that seems big, please submit a QIP".


To my sensibilities, an internal refactoring would not call for a QIP. 
Removing an unloved component might, depending on impact.  I can imagine a 
case where the developers hate component X and want to remove it, but the 
users like it.  In that case, we could use the higher-level release 
decision making that QIPs require.



Next I think we need to understand better how QIPs would relate to JIRAs.
I'm firmly of the opinion that nothing should be getting in to the codebase
without a JIRA that explains the reason for the change.  That doesn't
necessarily remove the need for a system for recording proposals to change
the "definition" of Qpid... but if I had to choose either a "proposals"
system or a change/defect tracker, I'd choose the latter.


Eeeks!  I definitely don't wish to propose we have one or the other.  More 
on this below.



I certainly see a lot of value in tracking "proposals" separately to actual
work tasks (which is what we predominantly have in JIRA right now).  However
I think it is work tasks that make up a release, so it is those (rather than
QIPs) which would be scheduled and tracked.  I'm less sure about using svn
as a tool for storing in-process QIPs - it seems unnatural.  Using a wiki or a
workflow (such as JIRA the tool rather than our current instance of it)


I'm pretty indifferent on the question of where draft QIPs should go.  I 
do, however, think of QIPs as documentation, so once accepted I tend to 
think they should go next to our other project documentation at qpid/doc.



seems more obvious.  There is utility in capturing completed QIPs as part of
our documentation if they define an interface or feature that is to be
exposed to our users (or possibly define why a specific feature or interface
has not been defined (or has been removed)). Wiki and JIRA both preserve a
similar level of source control and history to svn.



To do this, I've imitated an approach that's been used in other community
software projects.  The Python project, for instance, uses Python
Enhancement Proposals (PEPs) to review important changes to the Python
language [1].

I've attempted to produce something very similar here and tie it into our
release process.  It's my hope that during the 0.10 cycle some contributors
will opt to describe their major changes using the QIP process.  This is,
however, entirely optional.



As above, I'm not sure that QIPs should be what releases are based on.  If a
given QIP can be broken into two (or more) discrete pieces of work which can
then be defined as JIRAs, then it would seem to me that at the point the
release was being cut, then it could be OK (depending on the precise
details of the changes) for it to go out with only some, but not all, of the
JIRAs completed.


It depends on what you mean by "what releases are based on".  I am 
explicitly proposing that at some point in the future, we the community 
use QIP review to signal the acceptance or rejection of major work early 
in a release cycle.


On the tail end, however, I agree.  When we are nearing the end of a
release, I'm all for triaging the JIRAs in favor of a disciplined, 
regular release schedule (with no inadvertent regressions).



>> The attached QIP 1 and QIP template should provide an overview of how this
>> might work.  This is very much in an early and unrefined form, so I hope
>> you'll take a look and share your thoughts.



> As a group I think we need to do a lot more work on trying to define our
> goals (whether for an individual release, or more generally in terms of what
> we are trying to make Qpid be). And we need to be much better at
> understanding how each release is getting us closer to our goal.
>
> As an example, what is it that we are trying to achieve in our next (0.10)
> release?  And how will it help our end users?

Re: Qpid Improvement Process

2011-01-10 Thread Robert Godfrey
Hi Justin,

On 6 January 2011 15:42, Justin Ross  wrote:

> Hi, everyone.
>
> I'd like to propose a new way to communicate about the major changes going
> into our releases.  The goal is to help us make decisions early in the
> process, coordinate when a change in one place impacts another, and engender
> a broader sense of direction.  It should also help to document the work that
> we do.
>
>
So, I have a few questions / comments about this.

Firstly, I think we need to be clear whether QIPs are describing an end state
(an implied specification) or the act of making the change to the codebase
to achieve the end result (which is surely what a JIRA is for).
Secondly, I think we need a better definition of when a change should be a
QIP or not: what constitutes "major"?  Does a purely internal refactoring
count?  What about removing an unloved component?
Next, I think we need to understand better how QIPs would relate to JIRAs.
 I'm firmly of the opinion that nothing should be getting into the codebase
without a JIRA that explains the reason for the change.  That doesn't
necessarily remove the need for a system for recording proposals to change
the "definition" of Qpid... but if I had to choose between a "proposals"
system and a change/defect tracker, I'd choose the latter.

I certainly see a lot of value in tracking "proposals" separately to actual
work tasks (which is what we predominantly have in JIRA right now).  However
I think it is work tasks that make up a release, so it is those (rather than
QIPs) which would be scheduled and tracked.  I'm less sure about using svn
as a tool for storing in-process QIPs - it seems unnatural.  Using a wiki or a
workflow (such as JIRA the tool rather than our current instance of it)
seems more obvious.  There is utility in capturing completed QIPs as part of
our documentation if they define an interface or feature that is to be
exposed to our users (or possibly define why a specific feature or interface
has not been defined (or has been removed)). Wiki and JIRA both preserve a
similar level of source control and history to svn.


> To do this, I've imitated an approach that's been used in other community
> software projects.  The Python project, for instance, uses Python
> Enhancement Proposals (PEPs) to review important changes to the Python
> language [1].
>
> I've attempted to produce something very similar here and tie it into our
> release process.  It's my hope that during the 0.10 cycle some contributors
> will opt to describe their major changes using the QIP process.  This is,
> however, entirely optional.
>
>
As above, I'm not sure that QIPs should be what releases are based on.  If a
given QIP can be broken into two (or more) discrete pieces of work which can
then be defined as JIRAs, then it would seem to me that at the point the
release was being cut, it could be OK (depending on the precise
details of the changes) for it to go out with only some, but not all, of the
JIRAs completed.



> The attached QIP 1 and QIP template should provide an overview of how this
> might work.  This is very much in an early and unrefined form, so I hope
> you'll take a look and share your thoughts.
>
>
As a group I think we need to do a lot more work on trying to define our
goals (whether for an individual release, or more generally in terms of what
we are trying to make Qpid be). And we need to be much better at
understanding how each release is getting us closer to our goal.

As an example, what is it that we are trying to achieve in our next (0.10)
release?  And how will it help our end users?

It is against these goals that I think QIPs need to be evaluated.  I think
the process of planning release priorities may be in terms of QIPs, but more
likely in terms of a mix of QIPs and JIRAs... and every piece of work should
have a JIRA for it before it gets checked in.  If we use the JIRA system
properly we can get a much better overview of the state of the trunk and
(eventually) the release itself.

(As an aside, I'm not sure "Qpid Improvement Process" is a good name to give
these things; they seem more like proposals, and an "Improvement Proposal"
still seems more like a task - i.e. a JIRA... )

-- Rob


Re: svn commit: r1056420 - in /qpid/trunk/qpid/cpp/bindings: qmf/python/Makefile.am qmf/ruby/Makefile.am qmf2/python/Makefile.am qpid/python/Makefile.am qpid/ruby/Makefile.am

2011-01-10 Thread Ted Ross

Devs,

This commit causes the automake build to install Python files (as part 
of the wrapped interfaces) during 'make install'.  You may see a build error 
like the following:


 bindings/qpid/python/Makefile.am:39: required file 
`build-aux/py-compile' not found
 bindings/qpid/python/Makefile.am:39:   `automake --add-missing' can 
install `py-compile'


To fix this, run 'automake --add-missing' in the qpid/cpp directory.  
This will cause automake to install the missing py-compile helper script in 
build-aux.
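
As a sketch, the recovery sequence might look like the following in a working
tree (the 'qpid/cpp' path assumes a trunk checkout rooted at qpid/; adjust to
your own layout):

```shell
# Illustrative fix for the missing py-compile error; run from your checkout.
# qpid/cpp is the directory containing the C++ tree's configure.ac.
cd qpid/cpp

# Copy any missing auxiliary scripts (including py-compile) into build-aux/.
automake --add-missing

# Confirm the helper is present before re-running the build.
test -f build-aux/py-compile && echo "py-compile installed"
```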


-Ted


On 01/07/2011 12:55 PM, tr...@apache.org wrote:

Author: tross
Date: Fri Jan  7 17:55:07 2011
New Revision: 1056420

URL: http://svn.apache.org/viewvc?rev=1056420&view=rev
Log:
Cleaned up the makefiles for the Swig-generated bindings.
   1) Suppression of some warnings
   2) Proper installation of artifacts in "make install"

Modified:
 qpid/trunk/qpid/cpp/bindings/qmf/python/Makefile.am
 qpid/trunk/qpid/cpp/bindings/qmf/ruby/Makefile.am
 qpid/trunk/qpid/cpp/bindings/qmf2/python/Makefile.am
 qpid/trunk/qpid/cpp/bindings/qpid/python/Makefile.am
 qpid/trunk/qpid/cpp/bindings/qpid/ruby/Makefile.am

Modified: qpid/trunk/qpid/cpp/bindings/qmf/python/Makefile.am
URL: 
http://svn.apache.org/viewvc/qpid/trunk/qpid/cpp/bindings/qmf/python/Makefile.am?rev=1056420&r1=1056419&r2=1056420&view=diff
==
--- qpid/trunk/qpid/cpp/bindings/qmf/python/Makefile.am (original)
+++ qpid/trunk/qpid/cpp/bindings/qmf/python/Makefile.am Fri Jan  7 17:55:07 2011
@@ -27,9 +27,10 @@ generated_file_list = \

  EXTRA_DIST = python.i
  BUILT_SOURCES = $(generated_file_list)
+SWIG_FLAGS = -w362,401

  $(generated_file_list): $(srcdir)/python.i $(srcdir)/../qmfengine.i
-   swig -c++ -python -Wall $(INCLUDES) $(QPID_CXXFLAGS) 
-I$(top_srcdir)/src/qmf -I/usr/include -o qmfengine.cpp $(srcdir)/python.i
+   swig -c++ -python $(SWIG_FLAGS) $(INCLUDES) $(QPID_CXXFLAGS) 
-I$(top_srcdir)/src/qmf -I/usr/include -o qmfengine.cpp $(srcdir)/python.i

  pylibdir = $(PYTHON_LIB)


Modified: qpid/trunk/qpid/cpp/bindings/qmf/ruby/Makefile.am
URL: 
http://svn.apache.org/viewvc/qpid/trunk/qpid/cpp/bindings/qmf/ruby/Makefile.am?rev=1056420&r1=1056419&r2=1056420&view=diff
==
--- qpid/trunk/qpid/cpp/bindings/qmf/ruby/Makefile.am (original)
+++ qpid/trunk/qpid/cpp/bindings/qmf/ruby/Makefile.am Fri Jan  7 17:55:07 2011
@@ -23,13 +23,14 @@ INCLUDES = -I$(top_srcdir)/include -I$(t

  EXTRA_DIST = ruby.i
  BUILT_SOURCES = qmfengine.cpp
+SWIG_FLAGS = -w362,401

  rubylibdir = $(RUBY_LIB)

  dist_rubylib_DATA = qmf.rb

  qmfengine.cpp: $(srcdir)/ruby.i $(srcdir)/../qmfengine.i
-   $(SWIG) -ruby -c++ -Wall $(INCLUDES) $(QPID_CXXFLAGS) -I/usr/include -o 
qmfengine.cpp $(srcdir)/ruby.i
+   $(SWIG) -ruby -c++ $(SWIG_FLAGS) $(INCLUDES) $(QPID_CXXFLAGS) 
-I/usr/include -o qmfengine.cpp $(srcdir)/ruby.i

  rubylibarchdir = $(RUBY_LIB_ARCH)
  rubylibarch_LTLIBRARIES = qmfengine.la

Modified: qpid/trunk/qpid/cpp/bindings/qmf2/python/Makefile.am
URL: 
http://svn.apache.org/viewvc/qpid/trunk/qpid/cpp/bindings/qmf2/python/Makefile.am?rev=1056420&r1=1056419&r2=1056420&view=diff
==
--- qpid/trunk/qpid/cpp/bindings/qmf2/python/Makefile.am (original)
+++ qpid/trunk/qpid/cpp/bindings/qmf2/python/Makefile.am Fri Jan  7 17:55:07 
2011
@@ -35,6 +35,8 @@ $(generated_file_list): $(srcdir)/python
  pylibdir = $(PYTHON_LIB)

  lib_LTLIBRARIES = _cqmf2.la
+cqpiddir = $(pythondir)
+cqpid_PYTHON = qmf2.py cqmf2.py

  _cqmf2_la_LDFLAGS = -avoid-version -module -shared
  _cqmf2_la_LIBADD = $(PYTHON_LIBS) -L$(top_builddir)/src/.libs 
$(top_builddir)/src/libqmf2.la

Modified: qpid/trunk/qpid/cpp/bindings/qpid/python/Makefile.am
URL: 
http://svn.apache.org/viewvc/qpid/trunk/qpid/cpp/bindings/qpid/python/Makefile.am?rev=1056420&r1=1056419&r2=1056420&view=diff
==
--- qpid/trunk/qpid/cpp/bindings/qpid/python/Makefile.am (original)
+++ qpid/trunk/qpid/cpp/bindings/qpid/python/Makefile.am Fri Jan  7 17:55:07 
2011
@@ -27,16 +27,17 @@ generated_file_list = \

  EXTRA_DIST = python.i
  BUILT_SOURCES = $(generated_file_list)
+SWIG_FLAGS = -w362,401

  $(generated_file_list): $(srcdir)/python.i $(srcdir)/../qpid.i 
$(srcdir)/../../swig_python_typemaps.i
-   swig -c++ -python -Wall $(INCLUDES) $(QPID_CXXFLAGS) 
-I$(top_srcdir)/src/qmf -I/usr/include -o cqpid.cpp $(srcdir)/python.i
+   swig -c++ -python $(SWIG_FLAGS) $(INCLUDES) $(QPID_CXXFLAGS) 
-I$(top_srcdir)/src/qmf -I/usr/include -o cqpid.cpp $(srcdir)/python.i

  pylibdir = $(PYTHON_LIB)

  lib_LTLIBRARIES = _cqpid.la
+cqpiddir = $(pythondir)
+cqpid_PYTHON = cqpid.py

-#_cqpid_la_LDFLAGS = -avoid-version -module -shrext "$(PYTHON_SO)"
-#_cqpid_la_LDFLAGS = -avoid-version -module -shrext ".so"
  _cqpid_la_LDF

Re: Qpid Improvement Process

2011-01-10 Thread jross

Hi, Martin.

On Mon, 10 Jan 2011, Martin Ritchie wrote:


> Hi Justin,
>
> I very much like the QIP proposal; my only concern is about its
> potential for endless discussion. I understand that the QIPs have
> timeframes, but if discussion has not reached a satisfactory conclusion
> there is an option to defer. Would we have rules on how often we could
> defer a QIP before we had to stop and deal with the issue?


It's my hope that formalizing the proposal document will help to align the
discussion with our goals for upcoming releases.  I don't know if that 
will focus discussion, but I feel like it should at least help.


I'm having trouble imagining a reasonable set of rules for repeated 
deferment.  I do think that if we believe a QIP is complete but not a good 
fit for Qpid's direction, we should reject it, not defer it.  If something 
is a good fit, but we're not ready to accept it, then I don't see a reason 
to limit deferment.


For example, if we accepted a QIP to revamp HA in the next release, we 
might decline to accept another QIP that refactored federation, just 
because these are two far-reaching changes likely to collide.  In a 
scenario like this, I think we'd defer one or the other.



> How are such difficulties addressed by the Gnome and Python projects?


That's a good question, and I don't have an answer.  I'll ask some of the 
folks involved in those projects.



> The proposal suggests to me that a QIP must first be approved before
> any code is written. I think that this would potentially be a bad move,
> as ideas often need prototypes to either prove out the idea or to get
> community buy-in. How would we address the need for a QIP to have some
> initial development work done in svn?


Ah!  That means I need to be clearer about the intent.  I very much agree 
with you: prototypes or even nearly complete implementations are all to 
the good.  There's no need to wait on an implementation until after your 
QIP is accepted.


As to initial development work in svn, I think branches are the way to go. 
The new time-based release plan makes branches more important, and as you 
point out, so do QIPs.


Justin

-
Apache Qpid - AMQP Messaging Implementation
Project:  http://qpid.apache.org
Use/Interact: mailto:dev-subscr...@qpid.apache.org



[jira] Assigned: (QPID-2993) Federated source-local links crash remotely federated cluster member on local cluster startup

2011-01-10 Thread Alan Conway (JIRA)

 [ 
https://issues.apache.org/jira/browse/QPID-2993?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alan Conway reassigned QPID-2993:
-

Assignee: Alan Conway

> Federated source-local links crash remotely federated cluster member on local 
> cluster startup
> -
>
> Key: QPID-2993
> URL: https://issues.apache.org/jira/browse/QPID-2993
> Project: Qpid
>  Issue Type: Bug
>  Components: C++ Broker, C++ Clustering
>Affects Versions: 0.8
> Environment: Debian Linux Squeeze, 32-bit, kernel 2.6.36.2, Dell 
> Poweredge 1950s. Corosync==1.3.0, Openais==1.1.4
>Reporter: Mark Moseley
>Assignee: Alan Conway
>
> This is related to JIRA 2992 that I opened, but this is for source-local 
> routes. Given the same setup as in JIRA 2992 but using source-local routes 
> (and obviously with the exchanges switched accordingly in the qpid-route 
> statements), i.e. cluster A and cluster B with the routes between A1<->B1, 
> when cluster B shuts down in the order B2->B1 and starts back up, the static 
> routes are not correctly re-bound on cluster A's side. However if cluster B 
> is shut down in the order B1->B2 and started back up, the route is correctly 
> created and works. However in the non-functioning case (B2->B1, or A2->A1), 
> there is an additional side-effect: on node A2, qpidd crashes with the 
> following error (cluster A is called 'walclust', B is bosclust):
> 2011-01-07 18:57:35 error Channel exception: not-attached: Channel 1 is not 
> attached (qpid/amqp_0_10/SessionHandler.cpp:39)
> 2011-01-07 18:57:35 critical cluster(102.0.0.0:13650 READY/error) local error 
> 2030 did not occur on member 101.0.0.0:9920: not-attached: Channel 1 is not 
> attached (qpid/amqp_0_10/SessionHandler.cpp:39)
> 2011-01-07 18:57:35 critical Error delivering frames: local error did not 
> occur on all cluster members : not-attached: Channel 1 is not attached 
> (qpid/amqp_0_10/SessionHandler.cpp:39) (qpid/cluster/ErrorCheck.cpp:89)
> 2011-01-07 18:57:35 notice cluster(102.0.0.0:13650 LEFT/error) leaving 
> cluster walclust
> 2011-01-07 18:57:35 notice Shut down
> This happens on both sides of the cluster, so it's not limited to one or the 
> other. This crash does *not* occur in the A1->A2/B1->B2 test (i.e. the test 
> where the route is re-bound correctly). I can cause this to reoccur pretty 
> much every time. I've been resetting the cluster completely to a new state 
> between each test. Occasionally in the B2->B1 test, A1 will also crash with 
> the same error (and vice versa for A2->A1 for node B1), though most of the 
> time, it's A2/B2 that crashes.
> I was getting this same behaviour prior to upgrading corosync/openais as 
> well. Previously I was using the stock Squeeze versions of corosync==1.2.1 
> and openais==1.1.2. The results are the same with corosync=1.3.0 and 
> openais==1.1.4.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.





[jira] Assigned: (QPID-2992) Cluster failing to resurrect durable static route depending on order of shutdown

2011-01-10 Thread Alan Conway (JIRA)

 [ 
https://issues.apache.org/jira/browse/QPID-2992?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alan Conway reassigned QPID-2992:
-

Assignee: Alan Conway

> Cluster failing to resurrect durable static route depending on order of 
> shutdown
> 
>
> Key: QPID-2992
> URL: https://issues.apache.org/jira/browse/QPID-2992
> Project: Qpid
>  Issue Type: Bug
>  Components: C++ Broker, C++ Clustering
>Affects Versions: 0.8
> Environment: Debian Linux Squeeze, 32-bit, kernel 2.6.36.2, Dell 
> Poweredge 1950s. Corosync==1.3.0, Openais==1.1.4
>Reporter: Mark Moseley
>Assignee: Alan Conway
>
> I've got a 2-node qpid test cluster at each of 2 datacenters, which are 
> federated together with a single durable static route between each. Qpid is 
> version 0.8. Corosync and openais are stock Squeeze (1.2.1-3 and 1.1.2-2, 
> respectively). OS is Squeeze, 32-bit, on Dell Poweredge 1950s, kernel 2.6.36. 
> The static route is durable and is set up over SSL (but I can replicate as 
> well with non-SSL). I've tried to normalize the hostnames below to make 
> things clearer; hopefully I didn't mess anything up.
> Given two clusters, cluster A (consisting of hosts A1 and A2) and cluster B 
> (with B1 and B2), I've got a static exchange route from A1 to B1, as well as 
> another from B1 to A1. Federation is working correctly, so I can send a 
> message on A2 and have it successfully retrieved on B2. The exchange local to 
> cluster A is walmyex1; the local exchange for B is bosmyex1.
> If I shut down the cluster in this order: B2, then B1, and start back up with 
> B1, B2, the static route fails to get recreated. That is, on A1/A2, 
> looking at the bindings, exchange 'bosmyex1' does not get re-bound to cluster 
> B; the only output for it in "qpid-config exchanges --bindings" is just:
> 
> Exchange 'bosmyex1' (direct)
> 
> If however I shut the cluster down in this order: B1, then B2, and start B2, 
> then B1, the static route gets re-bound. The output then is:
> 
> Exchange 'bosmyex1' (direct)
> bind [unix.boston.cust] => 
> bridge_queue_1_8870523d-2286-408e-b5b5-50d53db2fa61
> 
> and I can message over the federated link with no further modification. Prior 
> to a few minutes ago, I was seeing this with the Squeeze stock openais==1.1.2 
> and corosync==1.2.1. In debugging this, I've upgraded both to the latest 
> versions with no change.
> I can replicate this every time I try. These are just test clusters, so I 
> don't have any other activity going on on them, or any other 
> exchanges/queues. My steps:
> On all boxes in cluster A and B:
> * Kill the qpidd if it's running and delete all existing store files, i.e. 
> contents of /var/lib/qpid/
> On host A1 in cluster A (I'm leaving out the -a user/t...@host stuff):
> * Start up qpid
> * qpid-config add exchange direct bosmyex1 --durable
> * qpid-config add exchange direct walmyex1 --durable
> * qpid-config add queue walmyq1 --durable
> * qpid-config bind walmyex1 walmyq1 unix.waltham.cust
> On host B1 in cluster B:
> * qpid-config add exchange direct bosmyex1 --durable
> * qpid-config add exchange direct walmyex1 --durable
> * qpid-config add queue bosmyq1 --durable
> * qpid-config bind bosmyex1 bosmyq1 unix.boston.cust
> On cluster A:
> * Start other member of cluster, A2
> * qpid-route route add amqps://user/p...@hosta1:5671 
> amqps://user/p...@hostb1:5671 walmyex1 unix.waltham.cust -d
> On cluster B:
> * Start other member of cluster, B2
> * qpid-route route add amqps://user/p...@hostb1:5671 
> amqps://user/p...@hosta1:5671 bosmyex1 unix.boston.cust -d
> On either cluster:
> * Check "qpid-config exchanges --bindings" to make sure bindings are correct 
> for remote exchanges
> * To see correct behaviour, stop cluster in the order B1->B2, or A1->A2, 
> start cluster back up, check bindings.
> * To see broken behaviour, stop cluster in the order B2->B1, or A2->A1, start 
> cluster back up, check bindings.
> This is a test cluster, so I'm free to do anything with it, debugging-wise, 
> that would be useful. 


