[jira] [Commented] (QPID-6213) qpidd misses heartbeats
[ https://issues.apache.org/jira/browse/QPID-6213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14532893#comment-14532893 ]

ASF GitHub Bot commented on QPID-6213:
--------------------------------------

Github user ChugR closed the pull request at:

    https://github.com/apache/qpid/pull/4

qpidd misses heartbeats
-----------------------

Key: QPID-6213
URL: https://issues.apache.org/jira/browse/QPID-6213
Project: Qpid
Issue Type: Bug
Components: C++ Broker
Affects Versions: 0.30
Reporter: Gordon Sim
Assignee: Gordon Sim
Fix For: 0.32
Attachments: 0001-QPID-6213-Fix-misuse-of-Timer-in-queue-cleaning-code.patch, QPID-6213-svn-10.patch, QPID-6213_suggested_further_fix.patch, qpid-6213-broker-1.log, qpid-6213-broker.log, qpid-6213-svn-01.patch, qpid-6213-svn-14.patch, qpidd.log.gz

Caused by https://issues.apache.org/jira/browse/QPID-5758.

Reproducer from Pavel Moravec: create many heartbeat enabled connections and queues (e.g. 500 idle receivers, each with their own queue) and have the purge interval relatively short (to speed up reproducing). The broker misses heartbeats and connections get timed out.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

To unsubscribe, e-mail: dev-unsubscr...@qpid.apache.org
For additional commands, e-mail: dev-h...@qpid.apache.org
[jira] [Commented] (QPID-6213) qpidd misses heartbeats
[ https://issues.apache.org/jira/browse/QPID-6213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14229783#comment-14229783 ]

ASF subversion and git services commented on QPID-6213:
-------------------------------------------------------

Commit 1642681 from c...@apache.org in branch 'qpid/trunk'
[ https://svn.apache.org/r1642681 ]

QPID-6213: qpidd misses heartbeats
* Pollable queue breaks when client does not process whole batch.
* QueueCleaner must not reschedule same task multiple times.
* QueueCleaner breaks out of batch processing on wall clock time interval.
[jira] [Commented] (QPID-6213) qpidd misses heartbeats
[ https://issues.apache.org/jira/browse/QPID-6213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14227458#comment-14227458 ]

Pavel Moravec commented on QPID-6213:
-------------------------------------

I have run some more stress tests (basically variants of the generic one) on upstream qpid patched with QPID-6213-svn-10.patch and found no issues. Kudos for the patch!

(I haven't reviewed the patch from a technical/code point of view; I just applied it and ran the stress tests.)
[jira] [Commented] (QPID-6213) qpidd misses heartbeats
[ https://issues.apache.org/jira/browse/QPID-6213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14227824#comment-14227824 ]

Alan Conway commented on QPID-6213:
-----------------------------------

Re qpid-6213-svn-14.patch: very nice work.

Please simplify by getting rid of the old while loop and making your new code the only way we process, i.e. delete old process(), rename processOneShot() as process() and get rid of the boolean flag to choose between them.

The old loop is just wrong:
- clearly if the callback doesn't process the entire batch then we should give up the thread; it makes no sense to stuff messages the callback has just refused back into the same callback in the same thread.
- it is possible that new messages arrive while we are processing the first batch, but again we should give up the thread before we process them, otherwise we can hold the thread for an unbounded time if messages keep arriving while we are processing.

The whole idea of batch processing in this code was exactly to do a bounded amount of work before giving up the thread, so the while loop was some sort of brain fart on my part.
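The "one bounded batch, then give up the thread" rule from the comment above can be illustrated with a small sketch. This is not Qpid's pollable queue: `OneShotQueue`, its callback signature (the callback returns how many items it handled), and `size()` are all hypothetical names invented for the illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical stand-in for a pollable queue; NOT Qpid's actual class.
template <typename T>
class OneShotQueue {
  public:
    using Batch = std::vector<T>;
    // Assumption: the callback consumes items from the front of the batch
    // and returns how many it handled. It may stop before the end.
    using Callback = std::function<std::size_t(const Batch&)>;

    explicit OneShotQueue(Callback cb) : callback(std::move(cb)) {}

    void push(const T& item) { pending.push_back(item); }

    // Hand the callback exactly one batch and return, so the calling thread
    // is free to do other work (for example, sending heartbeats). Items the
    // callback did not handle stay queued for a later call; they are never
    // pushed straight back into the callback in a loop.
    // Returns true if work remains.
    bool process() {
        Batch batch(pending.begin(), pending.end());
        if (batch.empty()) return false;
        const std::size_t handled = callback(batch);
        assert(handled <= batch.size());
        pending.erase(pending.begin(),
                      pending.begin() + static_cast<std::ptrdiff_t>(handled));
        return !pending.empty();
    }

    std::size_t size() const { return pending.size(); }

  private:
    Callback callback;
    std::deque<T> pending;
};
```

A caller would invoke `process()` once per wake-up and, when it returns true, arrange to be woken again rather than looping in place. That keeps the work done per wake-up bounded even if new items keep arriving.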
[jira] [Commented] (QPID-6213) qpidd misses heartbeats
[ https://issues.apache.org/jira/browse/QPID-6213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14224356#comment-14224356 ]

Gordon Sim commented on QPID-6213:
----------------------------------

Comment added to pull request. Looks good in general, just a couple of 'personal preferences' on style.
[jira] [Commented] (QPID-6213) qpidd misses heartbeats
[ https://issues.apache.org/jira/browse/QPID-6213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14224624#comment-14224624 ]

Gordon Sim commented on QPID-6213:
----------------------------------

From the log it looks like the problem you are seeing is caused by delayed timer tasks for the heartbeats. That suggests that the timer is getting messed up internally. A common way for that to happen, and the cause the first fix addressed, is for the same task to be added more than once (in which case the fire time may not match the queue order in all cases).

Though I can't yet see the path by which that would now be happening, it might be worth some extra checking to see if we can spot anything.
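As a rough illustration of the failure mode described in the comment above: a timer that keeps tasks ordered by fire time relies on each task appearing at most once, because changing the fire time of a task that is already queued silently invalidates the ordering. The sketch below guards against that with a per-task flag. `MiniTimer`, `Task`, and their members are hypothetical and do not mirror Qpid's Timer API.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <queue>
#include <vector>

// Hypothetical miniature timer; names do NOT mirror Qpid's Timer classes.
struct Task {
    long fireAt = 0;      // absolute time at which the task should run
    bool queued = false;  // true while the timer holds this task
    int fired = 0;        // how many times the task has run
};

class MiniTimer {
  public:
    // Refuses to queue a task that is already queued. Returns false in that
    // case and leaves both the task and the heap untouched, so the heap
    // ordering stays valid.
    bool add(const std::shared_ptr<Task>& t, long fireAt) {
        if (t->queued) return false;
        t->fireAt = fireAt;
        t->queued = true;
        heap.push(t);
        return true;
    }

    // Run every task due at or before `now`, earliest first.
    void advance(long now) {
        while (!heap.empty() && heap.top()->fireAt <= now) {
            std::shared_ptr<Task> t = heap.top();
            heap.pop();
            assert(t->queued);
            t->queued = false;  // the task may now be added again
            ++t->fired;
        }
    }

    std::size_t pending() const { return heap.size(); }

  private:
    struct Later {
        bool operator()(const std::shared_ptr<Task>& a,
                        const std::shared_ptr<Task>& b) const {
            return a->fireAt > b->fireAt;  // min-heap on fire time
        }
    };
    std::priority_queue<std::shared_ptr<Task>,
                        std::vector<std::shared_ptr<Task>>, Later> heap;
};
```

Without the `queued` check, a second `add()` for the same task would overwrite `fireAt` while the task already sits in the heap, which is one way unrelated tasks such as heartbeats could end up firing late.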
[jira] [Commented] (QPID-6213) qpidd misses heartbeats
[ https://issues.apache.org/jira/browse/QPID-6213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14225059#comment-14225059 ]

Gordon Sim commented on QPID-6213:
----------------------------------

Re 'Is it really the case that purging the queues can take longer than the 10 minutes between purges?' - most likely not, but the purge interval is configurable so it may in some cases be much less than 10 minutes.
[jira] [Commented] (QPID-6213) qpidd misses heartbeats
[ https://issues.apache.org/jira/browse/QPID-6213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14225079#comment-14225079 ]

Gordon Sim commented on QPID-6213:
----------------------------------

Yes, the use of restart() only in purge() and re-adding the task only in fired() seems cleaner.
[jira] [Commented] (QPID-6213) qpidd misses heartbeats
[ https://issues.apache.org/jira/browse/QPID-6213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14223566#comment-14223566 ]

ASF GitHub Bot commented on QPID-6213:
--------------------------------------

GitHub user ChugR opened a pull request:

    https://github.com/apache/qpid/pull/4

    QPID-6213: fix queue cleaner connection timeouts

Here's a patch that solves a Windows issue and is possibly a general solution to the base problem.

1. On Windows the queue cleaner runs before there are any queues. This causes an empty list of queues to get posted by the task, which exits without rescheduling itself. Then the pollable queue never fires because nothing ever got added. Finally the original task never gets rescheduled and the process is deadlocked.

   The solution is to add a null pointer to the pollable queue in the event that the pollable queue appears empty. This makes sure that the task on the other side of the pollable queue runs. Note that this null may get added even though a batch is already in flight. That will not break anything.

2. This patch adds a timeout, currently one second, which when exceeded gets the purge function to reschedule its pending work and exit. This gives the other i/o tasks a shot at running before the queue cleaning starts again.

   In my testing I had 2000 queues, and in its current form the queue cleaner will clean all 2000 before releasing the thread regardless of the batch size presented to the fire function. With this patch the 2000 queues are processed, but in chunks that are called by the pollable queue.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/ChugR/qpid trunk

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/qpid/pull/4.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #4

commit 25218c048393af2781737adf6457addbe971cb75
Author: Charles E. Rolke c...@apache.org
Date: 2014-11-24T21:35:09Z

    QPID-6213: fix queue cleaner connection timeouts
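Item 2 of the pull request (stop purging once a time budget is exceeded and leave the rest for a later call) might look roughly like the sketch below. It is an illustration under stated assumptions, not the patch itself: `purgeChunk`, the queue-name deque, and the injectable `now` parameter are hypothetical, and `std::chrono::steady_clock` is used here for the deadline in place of whatever clock the patch uses.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <deque>
#include <functional>
#include <string>

using Clock = std::chrono::steady_clock;

// Purge queues from the front of `pending` until the list is empty or
// `budget` has elapsed, whichever comes first. At least one queue is purged
// per call, so repeated calls always make progress. Returns the number of
// queues purged in this call; anything left in `pending` is the work the
// caller must reschedule.
// `now` is injectable (an assumption made for this sketch) so the time limit
// can be exercised without real delays.
inline std::size_t purgeChunk(
    std::deque<std::string>& pending,
    const std::function<void(const std::string&)>& purgeOne,
    std::chrono::milliseconds budget,
    const std::function<Clock::time_point()>& now = [] { return Clock::now(); }) {
    assert(budget.count() >= 0);
    const Clock::time_point deadline = now() + budget;
    std::size_t purged = 0;
    while (!pending.empty()) {
        purgeOne(pending.front());
        pending.pop_front();
        ++purged;
        if (now() >= deadline) break;  // out of time: leave the rest for later
    }
    return purged;
}
```

With a budget of `std::chrono::milliseconds(1000)`, matching the one-second timeout mentioned above, a caller would reschedule itself whenever `pending` is still non-empty after the call, which lets other I/O work run between chunks.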
[jira] [Commented] (QPID-6213) qpidd misses heartbeats
[ https://issues.apache.org/jira/browse/QPID-6213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14198142#comment-14198142 ]

ASF subversion and git services commented on QPID-6213:
-------------------------------------------------------

Commit 1636848 from [~gsim] in branch 'qpid/trunk'
[ https://svn.apache.org/r1636848 ]

QPID-6213: only restart timer once all queues have been purged