Re: cluster_read_credit

2010-05-04 Thread Alan Conway

On 05/04/2010 07:47 AM, Ján Sáreník wrote:

Hi Alan!

On Mon, May 03, 2010 at 10:33:34AM -0400, Alan Conway wrote:

Please create a JIRA with this info and assign it to me, I'll look into it.


https://issues.apache.org/jira/browse/QPID-2532 assigned.



I'm not able to reproduce this with r939184 on RHEL5. Do you have a machine
where you can reproduce it easily? I'd like to check the cluster*.log files and
the core file, if any. The symptoms suggest that one of the brokers crashed.


-
Apache Qpid - AMQP Messaging Implementation
Project:  http://qpid.apache.org
Use/Interact: mailto:dev-subscr...@qpid.apache.org



Re: cluster_read_credit

2010-05-03 Thread Alan Conway

On 04/29/2010 02:44 PM, Ján Sáreník wrote:

Hi again!

On Thu, Apr 29, 2010 at 07:56:02PM +0200, Ján Sáreník wrote:

The only errors I see are Valgrind errors, and Gordon says they are
merely SASL-related. Should I send a bug report to cyrus-sasl to make
it clean, or should those errors be added as exceptions? I would like to
spread the word about qpid and to encourage people from the community to
compile it and give it a try. If some of them have valgrind installed
they could be surprised that the tests fail...


What OS versions (incl. cyrus-sasl version) are you seeing this on? We should
check whether these are really cyrus bugs, in which case we can ignore them, or
whether qpid is not doing its part to clean up some memory allocated by cyrus.



After more 'make check' runs, even with VALGRIND= to disable it,
I got a few different errors:

  1. https://issues.apache.org/jira/browse/QPID-2552 reported earlier
     (just to prove it is still valid)

  2. cluster_read_credit fails randomly as well
     I thought I had worked around this error by making sure there
     are no leftover pidfiles in ~/.qpidd or /tmp and no leftover
     daemons running, but that does not actually help. It says:

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
PASS: run_cluster_test
2010-04-29 20:27:17 warning Connection closed
2010-04-29 20:27:17 warning Connect failed: Connection refused
2010-04-29 20:27:17 warning Connection closed
Failed: Cannot establish a connection
2010-04-29 20:27:17 critical Unexpected error: Removing stale lock file /tmp/qpidd.46939.pid
Errors stopping brokers on ports:  46939
FAIL: cluster_read_credit
PASS: test_watchdog
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Alan, you have a log entry for the cluster_read_credit file; do you
have an idea how to fix it?



Please create a JIRA with this info and assign it to me, I'll look into it.




Re: cluster_read_credit

2010-05-03 Thread Ján Sáreník
Hi Alan!

On Mon, May 03, 2010 at 10:33:34AM -0400, Alan Conway wrote:
 On Thu, Apr 29, 2010 at 07:56:02PM +0200, Ján Sáreník wrote:
 The only errors I see are Valgrind errors, and Gordon says they are
 merely SASL-related. Should I send a bug report to cyrus-sasl to make
 it clean, or should those errors be added as exceptions? I would like to
 spread the word about qpid and to encourage people from the community to
 compile it and give it a try. If some of them have valgrind installed
 they could be surprised that the tests fail...
 
 What OS versions (incl. cyrus-sasl version) are you seeing this
 on? We should check whether these are really cyrus bugs, in which
 case we can ignore them, or whether qpid is not doing its part to
 clean up some memory allocated by cyrus.

These Valgrind errors can be seen in the log attached to the
aforementioned https://issues.apache.org/jira/browse/QPID-2552

It is on Fedora Rawhide x86_64
cyrus-sasl-2.1.23-12.fc14.x86_64 (in use since 12th Apr)
boost-1.41.0-8.fc14.x86_64
valgrind-3.5.0-16.fc14.x86_64

I understand this is pretty bleeding-edge, but IMHO it is worth working hard
to find bugs that may be tied to the exact, sealed-off RHEL5 build environment,
so that I can give feedback to people with recent (and even non-RHEL,
non-Fedora) distributions.

And I want to make sure the tests run in such environments as well because
I believe there will be better progress if more people run 'make check'
on current qpid source, whatever distro it is on... you know there are
projects which just compile.


 SNIP
 Alan, you have a log entry for the cluster_read_credit file; do you
 have an idea how to fix it?
 
 
 Please create a JIRA with this info and assign it to me, I'll look into it.

Will do that soon.

I hope the kind of testing I do will help overall qpid stability
and usability. If you think otherwise, please give me feedback, as
so far I have only been sending compile logs (which the end of the
build error output kindly asked me to send).

If it helps, I can set up an automated build every night (our CEST TZ)
which will send out (or copy somewhere) the logs in case it exits
with an error...
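
For what it is worth, the nightly job offered above could be sketched
roughly as follows. This is only an illustration, not an existing qpid
script: the function name, the log file naming and the choice to keep
logs only for failed runs are all assumptions made for the example.

```python
import datetime
import os
import subprocess


def run_and_keep_log_on_failure(command, log_dir, cwd=None):
    """Run `command`; if it exits non-zero, save its output and return the log path.

    Returns None when the command succeeds, so no log is kept for clean runs.
    The function name and log layout are hypothetical, chosen for this sketch.
    """
    result = subprocess.run(
        command,
        cwd=cwd,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # interleave stderr with stdout, as on a terminal
        text=True,
        errors="replace",  # do not fail on odd bytes in the build output
    )
    if result.returncode == 0:
        return None

    os.makedirs(log_dir, exist_ok=True)
    stamp = datetime.datetime.now().strftime("%Y%m%d-%H%M%S")
    log_path = os.path.join(log_dir, "make-check-%s.log" % stamp)
    with open(log_path, "w", encoding="utf-8", errors="replace") as handle:
        handle.write("command: %s\n" % " ".join(command))
        handle.write("exit status: %d\n\n" % result.returncode)
        handle.write(result.stdout)
    return log_path
```

A scheduler such as cron could call this with ["make", "check"] and the
source checkout as cwd, then mail or copy the returned path when it is
not None. How the log gets sent out is left open here.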

HTH
  Best regards, Ján




cluster_read_credit (was: qpid-r939184-jasan make check)

2010-04-29 Thread Ján Sáreník
Hi again!

On Thu, Apr 29, 2010 at 07:56:02PM +0200, Ján Sáreník wrote:
 The only errors I see are Valgrind errors, and Gordon says they are
 merely SASL-related. Should I send a bug report to cyrus-sasl to make
 it clean, or should those errors be added as exceptions? I would like to
 spread the word about qpid and to encourage people from the community to
 compile it and give it a try. If some of them have valgrind installed
 they could be surprised that the tests fail...

After more 'make check' runs, even with VALGRIND= to disable it,
I got a few different errors:

 1. https://issues.apache.org/jira/browse/QPID-2552 reported earlier
    (just to prove it is still valid)

 2. cluster_read_credit fails randomly as well
    I thought I had worked around this error by making sure there
    are no leftover pidfiles in ~/.qpidd or /tmp and no leftover
    daemons running, but that does not actually help. It says:

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
PASS: run_cluster_test
2010-04-29 20:27:17 warning Connection closed
2010-04-29 20:27:17 warning Connect failed: Connection refused
2010-04-29 20:27:17 warning Connection closed
Failed: Cannot establish a connection
2010-04-29 20:27:17 critical Unexpected error: Removing stale lock file /tmp/qpidd.46939.pid
Errors stopping brokers on ports:  46939
FAIL: cluster_read_credit
PASS: test_watchdog
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
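
The manual check for leftover pidfiles mentioned above could be
automated along these lines. This is a minimal sketch under loud
assumptions: the qpidd.*.pid pattern is taken only from the path in
the log above, and the idea that each file holds the broker's process
ID as its first token has not been verified against qpidd itself. The
helper names are made up for the example, and the liveness check is
POSIX-only.

```python
import glob
import os


def process_exists(pid):
    """Return True if a process with this PID appears to be running (POSIX only)."""
    if pid <= 0:
        # 0 and negative values address process groups in kill(2), not one process.
        return False
    try:
        os.kill(pid, 0)  # signal 0 performs the existence check without signalling
    except ProcessLookupError:
        return False
    except PermissionError:
        # The process exists but belongs to another user.
        return True
    return True


def find_stale_pidfiles(directory, pattern="qpidd.*.pid"):
    """Return the pidfiles in `directory` whose recorded process is not running.

    Assumption: the first whitespace-separated token in each file is a PID.
    Files that cannot be parsed are reported too, so they can be checked by hand.
    """
    stale = []
    for path in sorted(glob.glob(os.path.join(directory, pattern))):
        try:
            with open(path) as handle:
                pid = int(handle.read().split()[0])
        except (OSError, ValueError, IndexError):
            stale.append(path)
            continue
        if not process_exists(pid):
            stale.append(path)
    return stale
```

Calling find_stale_pidfiles("/tmp") before a test run would list the
candidates; the sketch deliberately only reports them and leaves any
removal to the person running the tests.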

Alan, you have a log entry for the cluster_read_credit file; do you
have an idea how to fix it?

   Thanks, Ján
-- 
Red Hat Czech, MRG Quality Engineering
