I've diagnosed com/sun/jini/test/impl/mahalo/RandomStressTest.td, but may not have time to regression test and check in the fix.

The test failure is due to an error in com/sun/jini/mahalo/TxnManagerImpl.java. It throws a NullPointerException from abort if the transaction is not known to transaction management. That case can arise, even for a formerly valid transaction, due to it having already been committed or aborted, so it is timing dependent and more likely to happen under stress than when doing more controlled tests.

The TransactionManager interface specifies UnknownTransactionException for that case. The test catches UnknownTransactionException and treats it as a valid, non-error result. NullPointerException is not caught until it gets back to the TaskManager's thread run method, which turns it into a logged WARNING. The uncaught exception prevents the task from reporting completion back to the test, leading to the timeout and test failure.

There is an issue of whether there is anything else that depends on getting a RuntimeException rather than a checked exception, so it needs a full regression test.

I also recommend reducing the timeout from 30 minutes. When the test succeeds, it takes less than a minute.

Here is a patch to change the exception:

Index: src/com/sun/jini/mahalo/TxnManagerImpl.java
===================================================================
--- src/com/sun/jini/mahalo/TxnManagerImpl.java (revision 1000286)
+++ src/com/sun/jini/mahalo/TxnManagerImpl.java (working copy)
@@ -933,7 +933,7 @@
                }       
                
        } else {        
-           throw new NullPointerException("No such transaction ["+id+"]");
+           throw new UnknownTransactionException("No such transaction 
["+id+"]");
        }


Patricia


On 9/25/2010 7:17 AM, Jonathan Costers wrote:
I actually did come across a test that may be interesting to run for these
changes ...

com/sun/jini/test/impl/mahalo/RandomStressTest.td

I have not been able to get this test to pass on any of my machines or VMs
...
It seems to make heavy use of TaskManager and RetryTask.
It can take a while to finish as well, so beware.

You think this may be a good one?

you can run it if you change to the<river>/qa directory and execute:
ant -Drun.tests=com/sun/jini/test/impl/mahalo/RandomStressTest.td run-tests

or similarly in an IDE, set property
run.tests=com/sun/jini/test/impl/mahalo/RandomStressTest.td
and run the run-tests target

make sure everything is built first, of course.

2010/9/21 Patricia Shanahan<[email protected]>

I'm testing my new TaskManager the , but I have some anomalies. It would
help me to get some more testing of
https://svn.apache.org/repos/asf/incubator/river/jtsk/skunk/patsTaskManagerdone 
in other WindowsXP environments.

Both the head revision and revision  998737 need to be tested. Revision
998737 is the one I plan to merge into the trunk. It changes the interface
between TaskManager and its callers, with minimal changes to TaskManager.

It is important that it be tested widely, because it affects a lot of
critical classes, and would be difficult to back out.

The head revision drops in a revised TaskManager. It should be easy to back
out if necessary.

Patricia



Reply via email to