Hi Todd,

Sorry for the clutter/confusion. Usually things aren't this cumbersome ;-)

In particular:
  1 committer is on vacation
  Mahadev's been out sick for multiple days
  I'm sick but trying to hang in there, but def not 100%

Hudson (CI), which gates all our commits, has been offline for effectively the past 3 weeks and is just now back but flaky.

3.2 had some bugs that we are trying to address, but the aforementioned issues are slowing us down. Otherwise we'd have all this straightened out by now.

At this point you should move this discussion to the dev list - Apache doesn't really like us to discuss code changes/futures here (user list). On that list you'll also see the plan for upcoming releases - I mention all this because we are actively working toward 3.2.1 which will include the JIRAs slated for that release (I'm sure you've seen).

If you can wait a bit you might be able to avoid some pain by using the upcoming 3.2.1 release. Once the patches land in that branch your issues will be resolved w/o you needing to manually apply patches, etc.


I did look at the files you attached; they look fine, so I'm not sure what the issue is. The form of this test makes it harder: we are verifying that the log contains sufficient information when a particular error occurs. We fiddle with log4j in order to do this, which means that the log you included doesn't show the problem.

Try instrumenting this test with a try/catch around the entire body of the failing test method (all the code in that method inside one big try/catch). Then print the error to stdout in the catch block. That should shed some light. If you could debug it a bit, that would help, because we aren't seeing this in our environment.
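
A minimal sketch of that instrumentation, for illustration only. The class name and the `originalTestBody()` helper are hypothetical stand-ins (the real test body is not reproduced in this thread), and the sketch leaves out JUnit so it can run on its own. The error is printed to stdout rather than logged, since the test reconfigures log4j.

```java
public class InstrumentedTestSketch {

    // Hypothetical stand-in for the existing body of the failing test method.
    static void originalTestBody() throws Exception {
        throw new IllegalStateException("simulated failure");
    }

    // The test method with its entire body wrapped in one try/catch.
    public void testBadPeerAddressInQuorum() throws Exception {
        try {
            originalTestBody();
        } catch (Throwable t) {
            // Print straight to stdout so the output does not depend on log4j.
            System.out.println("Caught in testBadPeerAddressInQuorum: " + t);
            t.printStackTrace(System.out);
            System.out.flush();
            // Rethrow so the test still fails after reporting.
            if (t instanceof Exception) {
                throw (Exception) t;
            }
            if (t instanceof Error) {
                throw (Error) t;
            }
            throw new RuntimeException(t);
        }
    }

    public static void main(String[] args) {
        try {
            new InstrumentedTestSketch().testBadPeerAddressInQuorum();
        } catch (Exception e) {
            System.out.println("Test still fails after reporting: " + e.getMessage());
        }
    }
}
```

Catching `Throwable` rather than `Exception` also reports errors such as `NoClassDefFoundError`. A catch block will not help if the forked JVM is terminated outright, in which case nothing is thrown to catch.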

Again, sort of a moot point if you can wait a week or so...

Regards,

Patrick

Todd Greenwood wrote:
Inline.

-----Original Message-----
From: Patrick Hunt [mailto:ph...@apache.org]
Sent: Thursday, July 30, 2009 10:57 PM
To: zookeeper-user@hadoop.apache.org
Subject: Re: test failures in branch-3.2

Todd Greenwood wrote:
Starting w/ branch-3.2 (no changes) I applied patches in this order:

1. Apply ZOOKEEPER-479.patch. Builds, but HierarchicalQuorumTest fails.
2. Apply ZOOKEEPER-481.patch. Fails to build, b/c of missing file - PortAssignment.java.

PortAssignment.java was added by Patrick as part of ZOOKEEPER-473.patch,
which is a pretty hefty patch (> 2k lines) and touches a large number of
files.
Hrm, those patches were probably created against the trunk. We'll have
to have separate patches for trunk and 3.2 branch on 481.

If you could update the jira with this detail (481 needs two patches,
one for each branch) that would be great!


Done.

3. Apply ZOOKEEPER-473.patch. Builds, but QuorumPeerMainTest fails (jvm crashes).
473 is "special" (unique) in the sense that it changes log4j while the
VM is running. In general, though, it's a pretty boring test and
shouldn't be failing.

Are you sure you have the right patch file? There are 2 patch files on
the JIRA for 473; make sure that you have the one from 7/16, NOT the one
from 7/15. Check the patch file: the correct one should NOT contain
changes to build.xml or conf/log4j* files. If this still happens, send me
your build.xml, conf/log4j* and QuorumPeerMainTest.java files in email
for review. I'll take a look.
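
One way to run that check mechanically is sketched below. It assumes the patch is a unified diff whose per-file headers start with `+++ `, which is a heuristic, not a full diff parser. The class and method names are hypothetical.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

public class PatchCheckSketch {

    // Returns the paths in the patch that touch build.xml or conf/log4j* files.
    static List<String> unexpectedPaths(List<String> patchLines) {
        List<String> hits = new ArrayList<>();
        for (String line : patchLines) {
            if (!line.startsWith("+++ ")) {
                continue; // only look at "new file" headers
            }
            // Header looks like "+++ path<TAB>(revision ...)"; keep the path only.
            String path = line.substring(4).trim().split("\\s+")[0];
            if (path.endsWith("build.xml") || path.contains("conf/log4j")) {
                hits.add(path);
            }
        }
        return hits;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.out.println("usage: java PatchCheckSketch <patch-file>");
            return;
        }
        // ISO-8859-1 maps every byte to a character, so decoding cannot fail.
        List<String> lines =
                Files.readAllLines(Paths.get(args[0]), StandardCharsets.ISO_8859_1);
        List<String> hits = unexpectedPaths(lines);
        System.out.println(hits.isEmpty()
                ? "No build.xml or conf/log4j* changes found."
                : "Unexpected files touched: " + hits);
    }
}
```

An empty result is consistent with the 7/16 patch as described above; any hit suggests the 7/15 file was picked up instead.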



I've annotated the files w/ their date while downloading:
112700 2009-07-31 11:02 ZOOKEEPER-473-7-15.patch
110607 2009-07-31 11:01 ZOOKEEPER-473-7-16.patch

It appears I applied the 7-16 patch, as that is the matching file size
of the patch file I applied.

If there are to be multiple patch files for multiple branches (3.2,
trunk, etc.) would it make sense to label the patch files accordingly?

Requested files in attached tar.

-Todd

Patrick


    [junit] Running org.apache.zookeeper.server.quorum.QuorumPeerMainTest
    [junit] Tests run: 1, Failures: 0, Errors: 1, Time elapsed: 0 sec
    [junit] Test org.apache.zookeeper.server.quorum.QuorumPeerMainTest FAILED (crashed)

------------
Test Log
------------
Testsuite: org.apache.zookeeper.server.quorum.QuorumPeerMainTest
Tests run: 1, Failures: 0, Errors: 1, Time elapsed: 0 sec

Testcase: testBadPeerAddressInQuorum took 0.004 sec
    Caused an ERROR
Forked Java VM exited abnormally. Please note the time in the report does not reflect the time until the VM exit.
junit.framework.AssertionFailedError: Forked Java VM exited abnormally. Please note the time in the report does not reflect the time until the VM exit.

-Todd

-----Original Message-----
From: Patrick Hunt [mailto:ph...@apache.org]
Sent: Thursday, July 30, 2009 10:13 PM
To: zookeeper-user@hadoop.apache.org
Subject: Re: test failures in branch-3.2

Todd Greenwood wrote:
....
[Todd] Yes, I believe "address in use" was the problem w/ FLETest. I
assumed it was a timing issue w/ respect to test A not fully releasing
resources before test B started.
Might be, but actually I think it's related to this:
http://hea-www.harvard.edu/~fine/Tech/addrinuse.html
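
The linked page is not quoted in this thread. Assuming it refers to the usual cause of "address in use" on a quick rebind (the old socket lingering in TIME_WAIT) and the usual remedy (SO_REUSEADDR), a minimal Java sketch looks like this. The class and method names are hypothetical, and this is not necessarily what the ZooKeeper tests do.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.ServerSocket;

public class ReuseAddressSketch {

    // Opens a listening socket with SO_REUSEADDR enabled before binding.
    static ServerSocket openReusable(int port) throws IOException {
        ServerSocket socket = new ServerSocket(); // created unbound
        socket.setReuseAddress(true);             // set before bind() so it takes effect
        socket.bind(new InetSocketAddress(port));
        return socket;
    }

    public static void main(String[] args) throws IOException {
        // Port 0 asks the OS for any free port.
        ServerSocket first = openReusable(0);
        int port = first.getLocalPort();
        first.close();

        // Rebind the same port right away, as back-to-back tests might.
        ServerSocket second = openReusable(port);
        System.out.println("Rebound port " + port + ": " + second.isBound());
        second.close();
    }
}
```

The option has to be set on an unbound socket; the `ServerSocket` documentation leaves the behavior undefined when it is changed after binding.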

Patrick
