[jira] [Commented] (QPID-6213) qpidd misses heartbeats

2015-05-07 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/QPID-6213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14532893#comment-14532893
 ] 

ASF GitHub Bot commented on QPID-6213:
--

Github user ChugR closed the pull request at:

https://github.com/apache/qpid/pull/4


 qpidd misses heartbeats
 ---

 Key: QPID-6213
 URL: https://issues.apache.org/jira/browse/QPID-6213
 Project: Qpid
  Issue Type: Bug
  Components: C++ Broker
Affects Versions: 0.30
Reporter: Gordon Sim
Assignee: Gordon Sim
 Fix For: 0.32

 Attachments: 
 0001-QPID-6213-Fix-misuse-of-Timer-in-queue-cleaning-code.patch, 
 QPID-6213-svn-10.patch, QPID-6213_suggested_further_fix.patch, 
 qpid-6213-broker-1.log, qpid-6213-broker.log, qpid-6213-svn-01.patch, 
 qpid-6213-svn-14.patch, qpidd.log.gz


 Caused by https://issues.apache.org/jira/browse/QPID-5758. Reproducer from 
 Pavel Moravec: create many heartbeat enabled connections and queues (e.g. 500 
 idle receivers, each with their own queue) and have the purge interval 
 relatively short (to speed up reproducing).
 The broker misses heartbeats and connections get timed out.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@qpid.apache.org
For additional commands, e-mail: dev-h...@qpid.apache.org



[jira] [Commented] (QPID-6213) qpidd misses heartbeats

2014-12-01 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/QPID-6213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14229783#comment-14229783
 ] 

ASF subversion and git services commented on QPID-6213:
---

Commit 1642681 from c...@apache.org in branch 'qpid/trunk'
[ https://svn.apache.org/r1642681 ]

QPID-6213: qpidd misses heartbeats

* Pollable queue breaks when client does not process whole batch.
* QueueCleaner must not reschedule same task multiple times.
* QueueCleaner breaks out of batch processing on wall clock time interval.







[jira] [Commented] (QPID-6213) qpidd misses heartbeats

2014-11-27 Thread Pavel Moravec (JIRA)

[ 
https://issues.apache.org/jira/browse/QPID-6213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14227458#comment-14227458
 ] 

Pavel Moravec commented on QPID-6213:
-

I have run some more stress tests (basically variants of the generic one) on 
upstream qpid patched with QPID-6213-svn-10.patch and found no issues. Kudos 
for the patch!

(I haven't reviewed the patch from a technical/code point of view; I just 
applied it and ran the stress tests.)







[jira] [Commented] (QPID-6213) qpidd misses heartbeats

2014-11-27 Thread Alan Conway (JIRA)

[ 
https://issues.apache.org/jira/browse/QPID-6213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14227824#comment-14227824
 ] 

Alan Conway commented on QPID-6213:
---

Re qpid-6213-svn-14.patch: very nice work. Please simplify by getting rid of 
the old while loop and making your new code the only way we process, i.e. 
delete the old process(), rename processOneShot() to process(), and get rid of 
the boolean flag that chooses between them.

The old loop is just wrong:
- clearly if the callback doesn't process the entire batch then we should give 
up the thread; it makes no sense to stuff messages the callback has just 
refused back into the same callback in the same thread.
- it is possible that new messages arrive while we are processing the first 
batch, but again we should give up the thread before we process them; 
otherwise we can hold the thread for an unbounded time if messages keep 
arriving while we are processing.

The whole idea of batch processing in this code was exactly to do a bounded 
amount of work before giving up the thread, so the while loop was some sort of 
brain fart on my part.
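
The "bounded amount of work, then give up the thread" idea can be sketched as follows. This is only an illustration, not the Qpid PollableQueue: the class name OneShotQueue, the callback signature (it returns how many items it handled) and the size() helper are assumptions made for the example.

```cpp
#include <cstddef>
#include <deque>
#include <functional>
#include <mutex>
#include <utility>
#include <vector>

// Hypothetical stand-in for a pollable queue; not the Qpid class.
template <typename T>
class OneShotQueue {
public:
    // The callback consumes items from the front of the batch and returns
    // how many it handled. It is allowed to stop early.
    using Callback = std::function<std::size_t(const std::vector<T>&)>;

    explicit OneShotQueue(Callback cb) : callback(std::move(cb)) {}

    void push(T item) {
        std::lock_guard<std::mutex> l(lock);
        pending.push_back(std::move(item));
    }

    // Snapshot the pending items, offer them to the callback exactly once,
    // put back whatever was refused, and return. There is no loop here, so
    // the calling thread is released after one bounded batch.
    // Returns true if items remain and another wake-up is needed.
    bool process() {
        std::vector<T> batch;
        {
            std::lock_guard<std::mutex> l(lock);
            batch.assign(pending.begin(), pending.end());
            pending.clear();
        }
        std::size_t done = batch.empty() ? 0 : callback(batch);
        if (done > batch.size()) done = batch.size();
        std::lock_guard<std::mutex> l(lock);
        // Refused items go back in front of anything that arrived meanwhile.
        pending.insert(pending.begin(),
                       batch.begin() + static_cast<std::ptrdiff_t>(done),
                       batch.end());
        return !pending.empty();
    }

    std::size_t size() const {
        std::lock_guard<std::mutex> l(lock);
        return pending.size();
    }

private:
    mutable std::mutex lock;
    std::deque<T> pending;
    Callback callback;
};
```

In this sketch a callback that handles only part of a batch is not called again in the same pass; the caller sees process() return true and is expected to arrange a later wake-up, so other work gets a turn on the thread in between.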







[jira] [Commented] (QPID-6213) qpidd misses heartbeats

2014-11-25 Thread Gordon Sim (JIRA)

[ 
https://issues.apache.org/jira/browse/QPID-6213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14224356#comment-14224356
 ] 

Gordon Sim commented on QPID-6213:
--

Comment added to pull request. Looks good in general, just a couple of 
'personal preferences' on style.







[jira] [Commented] (QPID-6213) qpidd misses heartbeats

2014-11-25 Thread Gordon Sim (JIRA)

[ 
https://issues.apache.org/jira/browse/QPID-6213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14224624#comment-14224624
 ] 

Gordon Sim commented on QPID-6213:
--

From the log it looks like the problem you are seeing is caused by delayed 
timer tasks for the heartbeats. That suggests that the timer is getting messed 
up internally. A common way for that to happen, and the cause the first fix 
addressed, is for the same task to be added more than once (in which case the 
fire time may not match the order in all cases). Though I can't yet see the 
path by which that would now be happening, it might be worth some extra 
checking to see if we can spot anything.
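
A minimal sketch of how adding the same task twice can upset a timer, and one way to guard against it. It assumes the timer keeps tasks in a heap ordered by a fire time stored inside the task object; Task, GuardedTimer and the integer tick times are hypothetical stand-ins, not the Qpid Timer API. Under that assumption, re-adding a queued task with a new fire time changes the key of an entry that is already in the heap, so the heap order no longer reflects the fire times and unrelated tasks such as heartbeats may be popped late.

```cpp
#include <cstddef>
#include <memory>
#include <queue>
#include <vector>

// Hypothetical task: the fire time lives in the task itself, so every heap
// entry that points at the task sees any later change to it.
struct Task {
    long fireAt = 0;      // next fire time in arbitrary ticks
    bool queued = false;  // true while the timer holds this task
};
using TaskPtr = std::shared_ptr<Task>;

// Orders the heap so that the earliest fire time is on top.
struct Later {
    bool operator()(const TaskPtr& a, const TaskPtr& b) const {
        return a->fireAt > b->fireAt;
    }
};

class GuardedTimer {
public:
    // Refuses to add a task that is already queued. Without this check a
    // second add() would rewrite fireAt underneath the existing heap entry.
    bool add(const TaskPtr& t, long fireAt) {
        if (t->queued) return false;
        t->fireAt = fireAt;
        t->queued = true;
        heap.push(t);
        return true;
    }

    // Pops the earliest task due at or before 'now', or returns null.
    TaskPtr popDue(long now) {
        if (heap.empty() || heap.top()->fireAt > now) return nullptr;
        TaskPtr t = heap.top();
        heap.pop();
        t->queued = false;
        return t;
    }

    std::size_t size() const { return heap.size(); }

private:
    std::priority_queue<TaskPtr, std::vector<TaskPtr>, Later> heap;
};
```

A check of this kind could also serve as the "extra checking" mentioned above: logging whenever add() returns false would show which code path tries to schedule a task that is still queued.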







[jira] [Commented] (QPID-6213) qpidd misses heartbeats

2014-11-25 Thread Gordon Sim (JIRA)

[ 
https://issues.apache.org/jira/browse/QPID-6213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14225059#comment-14225059
 ] 

Gordon Sim commented on QPID-6213:
--

Re 'Is it really the case that purging the queues can take longer than the 10 
minutes between purges?' - most likely not, but the purge interval is 
configurable so it may in some cases be much less than 10 minutes.







[jira] [Commented] (QPID-6213) qpidd misses heartbeats

2014-11-25 Thread Gordon Sim (JIRA)

[ 
https://issues.apache.org/jira/browse/QPID-6213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14225079#comment-14225079
 ] 

Gordon Sim commented on QPID-6213:
--

Yes, using restart() only in purge() and re-adding the task only in 
fired() seems cleaner.







[jira] [Commented] (QPID-6213) qpidd misses heartbeats

2014-11-24 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/QPID-6213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14223566#comment-14223566
 ] 

ASF GitHub Bot commented on QPID-6213:
--

GitHub user ChugR opened a pull request:

https://github.com/apache/qpid/pull/4

QPID-6213: fix queue cleaner connection timeouts

Here's a patch that solves a Windows issue and is possibly a general 
solution to the base problem.

1. On Windows the queue cleaner runs before there are any queues. This 
causes an empty list of queues to get posted by the task, which exits without 
rescheduling itself. Then the pollable queue never fires because nothing ever 
got added. Finally the original task never gets rescheduled and the process is 
deadlocked. The solution is to add a null pointer to the pollable queue in the 
event that the pollable queue appears empty. This makes sure that the task on 
the other side of the pollable queue runs. Note that this null may get added 
even though a batch is already in flight. That will not break anything.

2. This patch adds a timeout, currently one second, which when exceeded 
gets the purge function to reschedule its pending work and exit. This gives 
the other I/O tasks a shot at running before the queue cleaning starts again. 
In my testing I had 2000 queues, and in its current form the queue cleaner will 
clean all 2000 before releasing the thread regardless of the batch size 
presented to the fire function. With this patch the 2000 queues are processed, 
but in chunks that are called by the pollable queue.
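
The two points above can be sketched roughly as follows. This is a single-threaded illustration and not the code in the pull request: FakeQueue, postWork, purgeSome and the injected now function are all hypothetical names, and a plain std::deque stands in for the pollable queue.

```cpp
#include <chrono>
#include <cstddef>
#include <deque>
#include <memory>
#include <vector>

struct FakeQueue { int purges = 0; };  // hypothetical stand-in for a broker queue
using QueuePtr = std::shared_ptr<FakeQueue>;

// Point 1: post the queues found by the timer task. If nothing is pending
// afterwards, post a null sentinel so the consuming side still runs and gets
// the chance to reschedule the timer task.
inline void postWork(std::deque<QueuePtr>& pollable,
                     const std::vector<QueuePtr>& found) {
    pollable.insert(pollable.end(), found.begin(), found.end());
    if (pollable.empty()) pollable.push_back(QueuePtr());
}

// Point 2: purge from the front of 'pollable' until it is empty or the time
// budget is spent. At least one entry is handled per call, so progress is
// guaranteed even with a tiny budget. Returns true once everything has been
// purged, which is the moment to restart the timer.
template <typename NowFn>
bool purgeSome(std::deque<QueuePtr>& pollable, NowFn now,
               std::chrono::milliseconds budget) {
    const auto deadline = now() + budget;
    while (!pollable.empty()) {
        QueuePtr q = pollable.front();
        pollable.pop_front();
        if (q) ++q->purges;            // a null sentinel has nothing to purge
        if (now() >= deadline) break;  // budget spent: give the thread back
    }
    return pollable.empty();
}
```

With a real clock the call might look like purgeSome(pollable, std::chrono::steady_clock::now, std::chrono::milliseconds(1000)), matching the one-second timeout described above; the clock is passed in as a parameter only so the behaviour can be exercised with a fake clock.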

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/ChugR/qpid trunk

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/qpid/pull/4.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #4


commit 25218c048393af2781737adf6457addbe971cb75
Author: Charles E. Rolke c...@apache.org
Date:   2014-11-24T21:35:09Z

QPID-6213: fix queue cleaner connection timeouts










[jira] [Commented] (QPID-6213) qpidd misses heartbeats

2014-11-05 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/QPID-6213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14198142#comment-14198142
 ] 

ASF subversion and git services commented on QPID-6213:
---

Commit 1636848 from [~gsim] in branch 'qpid/trunk'
[ https://svn.apache.org/r1636848 ]

QPID-6213: only restart timer once all queues have been purged



