[jboss-user] [JBoss Messaging] - Re: strange failover behaviour in clustered config
http://jira.jboss.com/jira/browse/JBMESSAGING-1013 thanks View the original post : http://www.jboss.com/index.html?module=bb&op=viewtopic&p=4063702#4063702 Reply to the post : http://www.jboss.com/index.html?module=bb&op=posting&mode=reply&p=4063702 ___ jboss-user mailing list jboss-user@lists.jboss.org https://lists.jboss.org/mailman/listinfo/jboss-user
[jboss-user] [JBoss Messaging] - Re: strange failover behaviour in clustered config
I will create a JIRA task. Sysadmins are not always equipped to deal with difficult situations such as queue recovery :) They know how to stop, start, back up and upgrade. This looks more like a job for a DBA, but even so. Thanks again. The JIRA task will follow. View the original post : http://www.jboss.com/index.html?module=bb&op=viewtopic&p=4063696#4063696
[jboss-user] [JBoss Messaging] - Re: strange failover behaviour in clustered config
After a power failure I would expect a sysadmin to access the machines anyway to start everything up. If you want, you can add a JIRA feature request for a manual merge queue feature via JMX. View the original post : http://www.jboss.com/index.html?module=bb&op=viewtopic&p=4063661#4063661
[jboss-user] [JBoss Messaging] - Re: strange failover behaviour in clustered config
I fully agree that the workaround you provide will work, but with one constraint: you need direct access to the machine to do it. With all the security in production, that is not always possible. So it would be very nice if there were a way to do the merge manually. If it is a function call via jmx-console, that's perfect. Thanks anyway for such a fast response and explanation. View the original post : http://www.jboss.com/index.html?module=bb&op=viewtopic&p=4063625#4063625
[jboss-user] [JBoss Messaging] - Re: strange failover behaviour in clustered config
"timfox" wrote : "[EMAIL PROTECTED]" wrote : Just an idea: It would be possible to expose a method to merge queues from a dead node like they're describing, so a human who knows the server will never come back could start the merge procedure. But I feel like this is way too dangerous and error prone! | | You wouldn't need to do that if you just started a new server with the same id on a different node - why is this such an issue? I said it would be possible, but that it's a bad idea. So I would stick with starting another node with the same ID (as I also said earlier in this thread). View the original post : http://www.jboss.com/index.html?module=bb&op=viewtopic&p=4063611#4063611
[jboss-user] [JBoss Messaging] - Re: strange failover behaviour in clustered config
"[EMAIL PROTECTED]" wrote : Just an idea: It would be possible to expose a method to merge queues from a dead node like they're describing, so a human who knows the server will never come back could start the merge procedure. But I feel like this is way too dangerous and error prone! You wouldn't need to do that if you just started a new server with the same id on a different node - why is this such an issue? View the original post : http://www.jboss.com/index.html?module=bb&op=viewtopic&p=4063577#4063577
[jboss-user] [JBoss Messaging] - Re: strange failover behaviour in clustered config
Just an idea: It would be possible to expose a method to merge queues from a dead node like they're describing, so a human who knows the server will never come back could start the merge procedure. But I feel like this is way too dangerous and error prone! View the original post : http://www.jboss.com/index.html?module=bb&op=viewtopic&p=4063570#4063570
[jboss-user] [JBoss Messaging] - Re: strange failover behaviour in clustered config
You should also consider that if power goes out on the entire cluster, then when it comes back on you have to start the servers again anyway. So, when you try to start the server on node A and it fails because the node is hosed, you could just start the same server on a different node. It's exactly the same command line you'd be running, you'd just be running it on a different node. View the original post : http://www.jboss.com/index.html?module=bb&op=viewtopic&p=4063487#4063487
[jboss-user] [JBoss Messaging] - Re: strange failover behaviour in clustered config
Well, we could add such a switch, but it doesn't tackle the problem of all messages going to one node every time you do a normal startup. Also, going forward, most people want to move away from the old style "JBoss MQ" model where you have a single shared database which all nodes use, since this turns into a performance bottleneck. For better scalability, we're going to support each node having its own file based persistent storage, or its own database. (Of course we'll support the old model too.) Typically all the file based stores would be persisted on some kind of SAN or shared file system with redundancy built in. If a node fails, another one takes over. When starting up after a complete power failure, if a particular node doesn't start, e.g. the box is dead, then you could just start it on another node with the same server id. This could be done with a simple script. You could probably do something similar now, and then you wouldn't have to worry about starting the node on the same crashed box. But having all messages on one node is not really a scalable solution. View the original post : http://www.jboss.com/index.html?module=bb&op=viewtopic&p=4063479#4063479
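To illustrate what such a "simple script" would have to decide, here is a minimal sketch in Python. It only works out which server id should be started on which box; the function name, the host names and the idea of a list of spare boxes are assumptions made for the example, and the actual start command (and how the server id is passed to it) is deliberately left out because it depends on the installation.

```python
def plan_startup(home_host, healthy_hosts, spare_hosts):
    """Return a mapping of server id -> host on which to start that id.

    home_host:     server id -> box the id normally runs on
    healthy_hosts: set of boxes that came back after the power failure
    spare_hosts:   boxes that may host a displaced server id (hypothetical)
    """
    spares = [h for h in spare_hosts if h in healthy_hosts]
    plan = {}
    for server_id in sorted(home_host):
        host = home_host[server_id]
        if host in healthy_hosts:
            plan[server_id] = host            # normal case: start it at home
        elif spares:
            plan[server_id] = spares.pop(0)   # dead box: same id, other box
        else:
            plan[server_id] = None            # nothing available, needs a human
    return plan


# boxA did not come back, so server id 1 is started on the spare box.
plan = plan_startup({1: "boxA", 2: "boxB"}, {"boxB", "boxC"}, ["boxC"])
print(plan)
```

The point of the sketch is that the server id, not the box, identifies the node: the id keeps its own messages wherever it is started.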
[jboss-user] [JBoss Messaging] - Re: strange failover behaviour in clustered config
Don't you think that for a highly available environment there should be an option to do this? We have very strict SLAs. Messages need to be processed within an appropriate amount of time, no matter what. With the current JBossMQ (running as a singleton) this is not a problem. But with JBoss Messaging we would need to write a lot of code to basically redistribute things manually somehow. View the original post : http://www.jboss.com/index.html?module=bb&op=viewtopic&p=4063465#4063465
[jboss-user] [JBoss Messaging] - Re: strange failover behaviour in clustered config
"ramazanyich" wrote : but since the router policy is configured properly on all nodes (using round robin) | it should not be a problem if all messages are taken over by the first started node. | When the other nodes come up, JMS messages will be spread correctly to the other nodes again. Or am I wrong? Well, yes, IF you have configured it this way, but not everyone wants redistribution. Secondly, this puts a big strain on the first node: the operation to merge queues in the database is fairly heavyweight, it also has to load all these messages on one node (memory issues), and then all these messages have to be shifted off this node, which is very CPU and IO intensive. View the original post : http://www.jboss.com/index.html?module=bb&op=viewtopic&p=4063457#4063457
[jboss-user] [JBoss Messaging] - Re: strange failover behaviour in clustered config
But since the router policy is configured properly on all nodes (using round robin), it should not be a problem if all messages are taken over by the first started node. When the other nodes come up, JMS messages will be spread correctly to the other nodes again. Or am I wrong? View the original post : http://www.jboss.com/index.html?module=bb&op=viewtopic&p=4063455#4063455
[jboss-user] [JBoss Messaging] - Re: strange failover behaviour in clustered config
If one node crashes and server side failover is enabled, then the other node will take over the failed node's messages.

Now suppose both nodes crash at exactly the same time, e.g. power goes out to both nodes simultaneously. When you turn the power back on, you start up both nodes and find that one does not start. You want the other node to take over the messages from the node that does not start?

The problem with this is: how can the node that does start know that the other node really can't start, rather than that the sysadmin just hasn't started it yet? E.g. say you had 10 nodes, each with their own messages, and they were all currently down. The sysadmin then starts them one by one. According to what you want, the first node would start, and then say "look... the other nodes aren't alive so I'm going to take over all their messages". Then the sysadmin starts the other nodes, and you'd end up with all the messages on one node (the first node started), which is not good.

In most cases, if the power went off, it's more than likely that the nodes will be startable after the power comes back on, and the sysadmin will just start them all. In this case we *don't* want nodes to take over other nodes' messages. I think we should cater for the most common case (i.e. the nodes *are* startable after failure), and leave the less common case to require manual intervention (your case, where the node isn't startable after failure).

If you can think of a way of automatically dealing with both cases then I am open to suggestions, although I can't think of one right now. Adding a flag in the database for each node won't help, since it doesn't tell you whether the node is not startable. View the original post : http://www.jboss.com/index.html?module=bb&op=viewtopic&p=4063453#4063453
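The 10 node example can be played through with a small toy simulation in Python. This is only a sketch of the argument, not JBoss Messaging code: the function name, the `takeover_at_startup` switch and the figure of 100 messages per node are all made up for illustration.

```python
def start_one_by_one(owned, takeover_at_startup):
    """Start nodes in order; return node id -> messages it ends up holding.

    owned: node id -> number of persisted messages stored under that id.
    takeover_at_startup: hypothetical policy where a starting node claims
    the messages of every node that is not alive yet.
    """
    remaining = dict(owned)      # messages not yet claimed, per owner id
    result = {}
    order = sorted(owned)
    for position, node in enumerate(order):
        claimed = remaining.pop(node)
        if takeover_at_startup:
            # Nodes later in the start order look "dead" at this moment,
            # although the sysadmin simply has not started them yet.
            for other in order[position + 1:]:
                claimed += remaining.get(other, 0)
                remaining[other] = 0
        result[node] = claimed
    return result


owned = {f"node{i}": 100 for i in range(10)}
greedy = start_one_by_one(owned, takeover_at_startup=True)
normal = start_one_by_one(owned, takeover_at_startup=False)
print(greedy["node0"], normal["node0"])   # 1000 100
```

With takeover at startup the first node ends up holding all 1000 messages and every node started afterwards finds nothing; without it each node keeps its own 100.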
[jboss-user] [JBoss Messaging] - Re: strange failover behaviour in clustered config
Very simple. For example, we have 2 nodes in production and a lot of messages in the queue. Power goes down. One node does not start up. How do we detect whether there are still messages for the dead node? We have no direct access to the database server. Do we somehow need to write an application for querying the JMS tables? In Quartz both nodes register themselves in the db, so if one is dead the other one can detect it easily. View the original post : http://www.jboss.com/index.html?module=bb&op=viewtopic&p=4063441#4063441
[jboss-user] [JBoss Messaging] - Re: strange failover behaviour in clustered config
"sem" wrote : Sounds like a potential problem, because it would be difficult to monitor these things in a production environment. Can you explain in more detail? I didn't quite understand. Thanks. View the original post : http://www.jboss.com/index.html?module=bb&op=viewtopic&p=4063412#4063412
[jboss-user] [JBoss Messaging] - Re: strange failover behaviour in clustered config
Sounds like a potential problem, because it would be difficult to monitor these things in a production environment. Could you just keep some kind of heartbeat record in the database for every node, like Quartz does for example? In that case it's easy to detect dead nodes. Don't kill the messenger, just an idea. Thanks. View the original post : http://www.jboss.com/index.html?module=bb&op=viewtopic&p=4063337#4063337
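A minimal sketch of the heartbeat idea, in Python with an in-memory SQLite database so it can be run anywhere. The table name `node_heartbeat`, its columns and the helper functions are assumptions made for the example; this is neither the Quartz schema nor anything that exists in JBoss Messaging.

```python
import sqlite3


def make_store():
    # Hypothetical table: one row per node with the time of its last beat.
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE node_heartbeat (node_id TEXT PRIMARY KEY, last_seen REAL)"
    )
    return db


def beat(db, node_id, now):
    # Each live node calls this periodically to refresh its own row.
    db.execute(
        "INSERT OR REPLACE INTO node_heartbeat (node_id, last_seen) VALUES (?, ?)",
        (node_id, now),
    )


def stale_nodes(db, now, timeout):
    # Nodes whose last beat is older than `timeout` seconds.
    rows = db.execute(
        "SELECT node_id FROM node_heartbeat WHERE last_seen < ? ORDER BY node_id",
        (now - timeout,),
    )
    return [row[0] for row in rows]


db = make_store()
beat(db, "node1", 100.0)
beat(db, "node2", 100.0)
beat(db, "node1", 130.0)                 # node1 keeps beating, node2 goes silent
print(stale_nodes(db, now=140.0, timeout=30.0))
```

Note that a stale row only says a node has been silent for a while. It cannot tell a node that is gone for good from one that simply has not been started yet.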
[jboss-user] [JBoss Messaging] - Re: strange failover behaviour in clustered config
If nodeA was the last node to crash and is lost forever (it will never come back again), you could update the serverID in the database (or start another node with the same ID, as you described). View the original post : http://www.jboss.com/index.html?module=bb&op=viewtopic&p=4063033#4063033
[jboss-user] [JBoss Messaging] - Re: strange failover behaviour in clustered config
Thanks for the explanation. Just a real case for clarification. Imagine node A has crashed completely (disk failure), which means I will not be able to start it. Do I understand correctly that I have to install a JBoss Messaging server on a new machine and assign it the same serverpeer id that node A had, in order to process the remaining messages? View the original post : http://www.jboss.com/index.html?module=bb&op=viewtopic&p=4063029#4063029
[jboss-user] [JBoss Messaging] - Re: strange failover behaviour in clustered config
When you killed nodeA, all your messages were merged into nodeB. When you killed nodeB, you didn't have any nodes left to accept the failover. Later you started nodeA, and nothing was merged from nodeB. When you started nodeB again, the messages were loaded on the cluster again. I would say you always need at least one server up in the cluster. For nodeA to take over the messages before nodeB is loaded, we would need to merge messages from nodeB (the way you're describing), but I'm not sure that's a good idea, as we have no control over when nodeB will be loaded. (Imagine nodeB being loaded just a few seconds after nodeA.) View the original post : http://www.jboss.com/index.html?module=bb&op=viewtopic&p=4063022#4063022
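The sequence above can be replayed with a toy model in Python. It is only a sketch of the behaviour as described in this post, not JBoss Messaging code; the class `ToyCluster`, its methods and the message names are invented for the example.

```python
class ToyCluster:
    """Toy model: messages are stored in a shared database per node id."""

    def __init__(self, stored):
        # stored: node id -> list of persisted messages owned by that id
        self.stored = {node: list(msgs) for node, msgs in stored.items()}
        self.live = set()

    def start(self, node):
        # A starting node only loads the messages stored under its own id.
        self.live.add(node)
        return list(self.stored.setdefault(node, []))

    def crash(self, node):
        self.live.discard(node)
        if not self.live:
            # Nobody is left to accept the failover: the messages stay
            # in the database under the crashed node's id.
            return None
        # Otherwise a surviving node merges the crashed node's messages.
        survivor = min(self.live)
        self.stored[survivor].extend(self.stored[node])
        self.stored[node] = []
        return survivor


cluster = ToyCluster({"nodeA": ["a1", "a2"], "nodeB": ["b1"]})
cluster.start("nodeA")
cluster.start("nodeB")
merged_into = cluster.crash("nodeA")    # nodeB takes over nodeA's messages
orphaned = cluster.crash("nodeB")       # no live node left, nothing is merged
loaded_by_a = cluster.start("nodeA")    # nodeA finds nothing under its own id
loaded_by_b = cluster.start("nodeB")    # nodeB reloads everything
print(merged_into, orphaned, loaded_by_a, loaded_by_b)
```

In the model, as in the report, the messages only become visible again once nodeB is started, because after the first failover they are all stored under nodeB's id.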