On 4/01/2014 3:18 PM, Greg Trasuk wrote:
I’ll also point out Patricia’s recent statement that TaskManager should be 
reasonably efficient for small task queues, but less efficient for larger task 
queues.  We don’t have solid evidence that the task queues ever get large.  
Hence, the assertion that “TaskManager doesn’t scale” is meaningless.

No, it's not about scalability; it's about the window of time when a task has been removed from the queue in TaskManager for execution but fails and needs to be retried later. During that window the tasks seen by Task.runAfter no longer include the task that "should have executed", so dependent tasks proceed before their dependencies.

This code comment from ServiceDiscoveryManager might help:

       /** This task class, when executed, first registers to receive
         *  ServiceEvents from the given ServiceRegistrar. If the registration
         *  process succeeds (no RemoteExceptions), it then executes the
         *  LookupTask to query the given ServiceRegistrar for a "snapshot"
         *  of its current state with respect to services that match the
         *  given template.
         *
         *  Note that the order of execution of the two tasks is important.
         *  That is, the LookupTask must be executed only after registration
         *  for events has completed. This is because when an entity registers
         *  with the event mechanism of a ServiceRegistrar, the entity will
         *  only receive notification of events that occur "in the future",
         *  after the registration is made. The entity will not receive events
         *  about changes to the state of the ServiceRegistrar that may have
         *  occurred before or during the registration process.
         *
         *  Thus, if the order of these tasks were reversed and the LookupTask
         *  were to be executed prior to the RegisterListenerTask, then the
         *  possibility exists for the occurrence of a change in the
         *  ServiceRegistrar's state between the time the LookupTask retrieves
         *  a snapshot of that state, and the time the event registration
         *  process has completed, resulting in an incorrect view of the
         *  current state of the ServiceRegistrar.
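
To make the ordering constraint in that comment concrete, here is a minimal sketch. It is not the ServiceDiscoveryManager code; the class name, the Registration interface and the recorded step names are hypothetical and exist only for illustration. It runs the registration first and takes the snapshot only if registration did not throw.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class RegisterThenLookupSketch {

    /** Hypothetical stand-in for registering an event listener with a registrar. */
    interface Registration {
        void register() throws Exception;
    }

    /**
     * Runs the registration and, only if it did not throw, the lookup.
     * Returns the steps that actually ran, in order.
     */
    static List<String> run(Registration registration, Runnable lookup) throws Exception {
        List<String> steps = new CopyOnWriteArrayList<>();
        Runnable task = () -> {
            try {
                registration.register();
                steps.add("register");
            } catch (Exception e) {
                steps.add("register-failed");
                return; // never take the snapshot without a listener in place
            }
            lookup.run();
            steps.add("lookup");
        };
        ExecutorService executor = Executors.newSingleThreadExecutor();
        try {
            executor.submit(task).get(); // wait so the caller can inspect the steps
        } finally {
            executor.shutdown();
        }
        return steps;
    }
}
```

Keeping both steps inside one task is the simplest way to guarantee the order; splitting them into two queued tasks is where the dependency problem described above comes from.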

If real usage never requires a large task queue, then scalability isn’t an 
issue, and we don’t know whether it ever needs a large task queue.

In any case, removing TaskManager and replacing it with hard-coded 
ThreadPoolExecutors moves us farther away from having the capability of a 
shared work queue.  So I’m not in favour of this change.  I haven’t looked 
at the other services or utility classes, but if the changes are similar, 
I’m also not in favour.

No, TaskManager instances have been replaced by ExecutorService, which is set via configuration. The hard-coded part is how tasks are ordered when they pass through the ExecutorService that the configuration provides.

One option is to stop worrying about event order at the sender, and figure out a way of ordering at the recipient.



   You’re introducing changes that introduce test failures (which is why you’re 
asking for help) without a good reason.

The reason is to expose synchronization bugs so they can be observed and fixed.

  You’re never going to ship this code unless you stop modifying it.

It is a considerable undertaking, but I'm not in any hurry; it's not yet ready for release. I understand you're concerned about the considerable number of changes; there will be plenty of time for compatibility testing.

Also, when you say below,
I'm developing an ExecutorService wrapper that retries failed tasks in 
org.apache.river.impl.thread.SerialExecutorService. By not removing a task from 
its queue until it completes successfully, it prevents any dependent tasks 
from running. I would like to use this as a replacement for TaskManager and 
RetryTask.
…be careful!  You’re getting into the same difficult area as transactional 
semantics around messaging.  Will you need to provide a “dead task” queue?  Do 
you need to set a limit on how many times a task gets retried?  What happens 
when that limit is exceeded?  Do all tasks have the same limit?  Should a task 
get notified when it’s exceeded the retry limit?  How long should you wait 
between retries?  Is that number the same for all tasks?  Is there some kind of 
alarm or notification when tasks end up being retried, or when the dead task 
queue becomes full?

Since most dependencies appear to be based on who the recipient is, if that recipient is not contactable in spite of considerable effort, we should abandon further attempts to do so. At present RetryTask will continue to attempt to make contact every 5 minutes.
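
As a rough illustration of abandoning after a bounded effort, here is a minimal sketch. It is not RetryTask; the class name, the maxAttempts and delayMillis parameters, and the -1 return value for "abandoned" are all assumptions made for the example.

```java
import java.util.concurrent.Callable;

class BoundedRetrySketch {

    /**
     * Calls the task until it returns true or maxAttempts is reached.
     * Returns the attempt number that succeeded, or -1 if the task was abandoned.
     * A task that throws is treated the same as one that returns false.
     */
    static int runWithLimit(Callable<Boolean> task, int maxAttempts, long delayMillis)
            throws InterruptedException {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                if (task.call()) {
                    return attempt;
                }
            } catch (InterruptedException e) {
                throw e; // do not swallow interruption
            } catch (Exception e) {
                // counted as a failed attempt (hypothetical policy)
            }
            if (attempt < maxAttempts) {
                Thread.sleep(delayMillis); // fixed delay between attempts
            }
        }
        return -1; // recipient considered unreachable; caller decides what to discard
    }
}
```

Whether the limit and the delay should be the same for every task is exactly the open question raised above; the sketch simply makes them per-call parameters.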

Regards,

Peter.

Sometimes it’s best not to try to abstract-away all complexity.

Greg.

On Jan 3, 2014, at 10:43 PM, Peter Firmstone<j...@zeus.net.au>  wrote:

ServiceDiscoveryManager is now the only class that utilises TaskManager and 
RetryTask.  JoinManager still uses TaskManager but not RetryTask.  See 
River-344 for an explanation of the problem.

Most instances of TaskManager in qa-refactor have been replaced with 
ExecutorService, RetryTask now implements RunnableFuture and can be cancelled 
by Future.cancel from the ExecutorService.
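
For readers unfamiliar with RunnableFuture, here is a small sketch of the cancellation behaviour using the JDK's own FutureTask (which implements RunnableFuture) rather than RetryTask itself; the class and method names are hypothetical. A task cancelled before a worker thread picks it up never runs its body.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.FutureTask;
import java.util.concurrent.RunnableFuture;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

class CancellableTaskSketch {

    /** Cancels a task, then hands it to an executor; reports whether its body ran. */
    static boolean bodyRanAfterCancel() throws InterruptedException {
        AtomicBoolean ran = new AtomicBoolean(false);
        RunnableFuture<Void> task = new FutureTask<Void>(() -> ran.set(true), null);
        ExecutorService executor = Executors.newSingleThreadExecutor();
        try {
            task.cancel(false);     // cancel before any worker has started it
            executor.execute(task); // run() on a cancelled FutureTask is a no-op
        } finally {
            executor.shutdown();
            executor.awaitTermination(5, TimeUnit.SECONDS);
        }
        return ran.get();
    }
}
```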

I'm developing an ExecutorService wrapper that retries failed tasks in 
org.apache.river.impl.thread.SerialExecutorService. By not removing a task from 
its queue until it completes successfully, it prevents any dependent tasks 
from running. I would like to use this as a replacement for TaskManager and 
RetryTask.
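
To show the idea without the real class, here is a simplified, single-threaded sketch. It is not the SerialExecutorService code in qa-refactor; the RetryableTask interface and the log strings are hypothetical. The head of the queue is only peeked and is removed once it succeeds, so a later task cannot overtake one that is still failing.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

class HeadOfQueueRetrySketch {

    /** Hypothetical task: run() returns true on success, false if it must be retried. */
    interface RetryableTask {
        String name();
        boolean run();
    }

    /**
     * Executes tasks strictly in queue order and returns a log of what happened.
     * Note: this sketch has no retry limit or back-off, so a task that never
     * succeeds would block the queue forever.
     */
    static List<String> drain(Queue<RetryableTask> queue) {
        List<String> log = new ArrayList<>();
        while (!queue.isEmpty()) {
            RetryableTask head = queue.peek(); // look, but do not remove yet
            if (head.run()) {
                queue.remove();                // only a successful task leaves the queue
                log.add(head.name() + ":ok");
            } else {
                log.add(head.name() + ":retry");
            }
        }
        return log;
    }
}
```

With a task A that fails twice before succeeding, followed by a task B, the log records two retries of A, then A succeeding, and only then B.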

Can anyone spare time to review, suggest alternatives, or improvements?

Thanks in advance,

Peter.

Failed com_sun_jini_test_impl_servicediscovery_event_DiscardDownReDiscover.td
Test Failed: com.sun.jini.qa.harness.TestException: discovery failed -- waited 
30 seconds (0 minutes) -- 2 discovery event(s) expected, 1 discovery event(s) 
received

Failed  com_sun_jini_test_impl_servicediscovery_event_DiscardServiceDown.td
Test Failed: com.sun.jini.qa.harness.TestException: discovery failed -- waited 
30 seconds (0 minutes) -- 2 discovery event(s) expected, 0 discovery event(s) 
received

Failed  com_sun_jini_test_impl_servicediscovery_event_DiscardServiceUp.td
Test Failed: com.sun.jini.qa.harness.TestException: discovery failed -- waited 
30 seconds (0 minutes) -- 2 discovery event(s) expected, 0 discovery event(s) 
received

Failed  com_sun_jini_test_impl_servicediscovery_event_LookupTaskRace.td
Test Failed: com.sun.jini.qa.harness.TestException: discovery failed -- waited 
30 seconds (0 minutes) -- 2 discovery event(s) expected, 0 discovery event(s) 
received

Failed  com_sun_jini_test_impl_servicediscovery_event_ReRegisterBadEquals.td
Test Failed: com.sun.jini.qa.harness.TestException: discovery failed -- waited 
30 seconds (0 minutes) -- 4 discovery event(s) expected, 0 discovery event(s) 
received

Failed com_sun_jini_test_impl_servicediscovery_event_ReRegisterGoodEquals.td
Test Failed: com.sun.jini.qa.harness.TestException: discovery failed -- waited 
30 seconds (0 minutes) -- 4 discovery event(s) expected, 0 discovery event(s) 
received

Failed  com_sun_jini_test_impl_servicediscovery_event_ServiceDiscardCacheTerminate.td
Test Failed: com.sun.jini.qa.harness.TestException: discovery failed -- waited 
30 seconds (0 minutes) -- 4 discovery event(s) expected, 0 discovery event(s) 
received

Failed  com_sun_jini_test_spec_servicediscovery_cache_CacheDiscard.td
Test Failed: com.sun.jini.qa.harness.TestException: discovery failed -- waited 
30 seconds (0 minutes) -- 2 discovery event(s) expected, 0 discovery event(s) 
received

Failed  com_sun_jini_test_spec_servicediscovery_cache_CacheLookup.td
Test Failed: com.sun.jini.qa.harness.TestException: discovery failed -- waited 
30 seconds (0 minutes) -- 2 discovery event(s) expected, 0 discovery event(s) 
received

Failed  com_sun_jini_test_spec_servicediscovery_lookup_Lookup.td
Test Failed: com.sun.jini.qa.harness.TestException: discovery failed -- waited 
30 seconds (0 minutes) -- 2 discovery event(s) expected, 0 discovery event(s) 
received

Failed  com_sun_jini_test_spec_servicediscovery_lookup_LookupMax.td
Test Failed: com.sun.jini.qa.harness.TestException: discovery failed -- waited 
30 seconds (0 minutes) -- 3 discovery event(s) expected, 2 discovery event(s) 
received

Failed  com_sun_jini_test_spec_servicediscovery_lookup_LookupMaxFilter.td
Test Failed: com.sun.jini.qa.harness.TestException: discovery failed -- waited 
30 seconds (0 minutes) -- 3 discovery event(s) expected, 0 discovery event(s) 
received

Failed  com_sun_jini_test_spec_servicediscovery_lookup_LookupMinEqualsMax.td
Test Failed: com.sun.jini.qa.harness.TestException: discovery failed -- waited 
30 seconds (0 minutes) -- 3 discovery event(s) expected, 0 discovery event(s) 
received

Failed  com_sun_jini_test_spec_servicediscovery_lookup_LookupMinMaxNoBlockFilter.td
Test Failed: com.sun.jini.qa.harness.TestException: discovery failed -- waited 
30 seconds (0 minutes) -- 3 discovery event(s) expected, 0 discovery event(s) 
received

Failed  com_sun_jini_test_spec_servicediscovery_lookup_LookupWait.td
Test Failed: com.sun.jini.qa.harness.TestException: discovery failed -- waited 
30 seconds (0 minutes) -- 2 discovery event(s) expected, 0 discovery event(s) 
received

Failed  com_sun_jini_test_spec_servicediscovery_lookup_LookupWaitFilter.td
Test Failed: com.sun.jini.qa.harness.TestException: discovery failed -- waited 
30 seconds (0 minutes) -- 2 discovery event(s) expected, 1 discovery event(s) 
received

Failed  com_sun_jini_test_spec_servicediscovery_lookup_LookupWaitNoBlock.td
Test Failed: com.sun.jini.qa.harness.TestException: discovery failed -- waited 
30 seconds (0 minutes) -- 2 discovery event(s) expected, 0 discovery event(s) 
received



On 4/01/2014 10:27 AM, Apache Jenkins Server wrote:
See<https://builds.apache.org/job/river-qa-refactor-win/45/>

------------------------------------------
[...truncated 15733 lines...]
      [java] Test Passed: OK
      [java]
      [java] -----------------------------------------
      [java] 
com/sun/jini/test/spec/javaspace/conformance/snapshot/SnapshotTransactionTakeIfExistsTest.td
      [java] Test Passed: OK
      [java]
      [java] -----------------------------------------
      [java] 
com/sun/jini/test/spec/javaspace/conformance/snapshot/SnapshotTransactionTakeIfExistsWaitTest.td
      [java] Test Passed: OK
      [java]
      [java] -----------------------------------------
      [java] 
com/sun/jini/test/spec/javaspace/conformance/snapshot/SnapshotTransactionTakeNO_WAITTest.td
      [java] Test Passed: OK
      [java]
      [java] -----------------------------------------
      [java] 
com/sun/jini/test/spec/javaspace/conformance/snapshot/SnapshotTransactionTakeReadTest.td
      [java] Test Passed: OK
      [java]
      [java] -----------------------------------------
      [java] 
com/sun/jini/test/spec/javaspace/conformance/snapshot/SnapshotTransactionTakeTest.td
      [java] Test Passed: OK
      [java]
      [java] -----------------------------------------
      [java] 
com/sun/jini/test/spec/javaspace/conformance/snapshot/SnapshotTransactionTakeWaitTest.td
      [java] Test Passed: OK
      [java]
      [java] -----------------------------------------
      [java] 
com/sun/jini/test/spec/javaspace/conformance/snapshot/SnapshotTransactionWriteLeaseANYTest.td
      [java] Test Passed: OK
      [java]
      [java] -----------------------------------------
      [java] 
com/sun/jini/test/spec/javaspace/conformance/snapshot/SnapshotTransactionWriteLeaseFOREVERTest.td
      [java] Test Passed: OK
      [java]
      [java] -----------------------------------------
      [java] 
com/sun/jini/test/spec/javaspace/conformance/snapshot/SnapshotTransactionWriteNegativeLeaseTest.td
      [java] Test Passed: OK
      [java]
      [java] -----------------------------------------
      [java] 
com/sun/jini/test/spec/javaspace/conformance/snapshot/SnapshotTransactionWriteTakeIfExistsNotifyTest.td
      [java] Test Passed: OK
      [java]
      [java] -----------------------------------------
      [java] 
com/sun/jini/test/spec/javaspace/conformance/snapshot/SnapshotTransactionWriteTakeIfExistsTest.td
      [java] Test Passed: OK
      [java]
      [java] -----------------------------------------
      [java] 
com/sun/jini/test/spec/javaspace/conformance/snapshot/SnapshotTransactionWriteTakeNotifyTest.td
      [java] Test Passed: OK
      [java]
      [java] -----------------------------------------
      [java] 
com/sun/jini/test/spec/javaspace/conformance/snapshot/SnapshotTransactionWriteTakeTest.td
      [java] Test Passed: OK
      [java]
      [java] -----------------------------------------
      [java] 
com/sun/jini/test/spec/javaspace/conformance/snapshot/SnapshotTransactionWriteTest.td
      [java] Test Passed: OK
      [java]
      [java] -----------------------------------------
      [java] 
com/sun/jini/test/spec/javaspace/conformance/snapshot/SnapshotWriteLeaseANYTest.td
      [java] Test Passed: OK
      [java]
      [java] -----------------------------------------
      [java] 
com/sun/jini/test/spec/javaspace/conformance/snapshot/SnapshotWriteLeaseFOREVERTest.td
      [java] Test Passed: OK
      [java]
      [java] -----------------------------------------
      [java] 
com/sun/jini/test/spec/javaspace/conformance/snapshot/SnapshotWriteNegativeLeaseTest.td
      [java] Test Passed: OK
      [java]
      [java] -----------------------------------------
      [java] 
com/sun/jini/test/spec/javaspace/conformance/snapshot/SnapshotWriteTest.td
      [java] Test Passed: OK
      [java]
      [java] -----------------------------------------
      [java] com/sun/jini/test/impl/mahalo/AdminIFShutdownTest.td
      [java] Test Passed: OK
      [java]
      [java] -----------------------------------------
      [java] com/sun/jini/test/impl/mahalo/AdminIFTest.td
      [java] Test Passed: OK
      [java]
      [java] -----------------------------------------
      [java] com/sun/jini/test/impl/mahalo/LeaseExpireCancelTest.td
      [java] Test Passed: OK
      [java]
      [java] -----------------------------------------
      [java] com/sun/jini/test/impl/mahalo/LeaseExpireRenewTest.td
      [java] Test Passed: OK
      [java]
      [java] -----------------------------------------
      [java] com/sun/jini/test/impl/mahalo/LeaseMapTest.td
      [java] Test Passed: OK
      [java]
      [java] -----------------------------------------
      [java] com/sun/jini/test/impl/mahalo/LeaseTest.td
      [java] Test Passed: OK
      [java]
      [java] -----------------------------------------
      [java] com/sun/jini/test/impl/mahalo/MahaloCreateShutdownTest.td
      [java] Test Passed: OK
      [java]
      [java] -----------------------------------------
      [java] com/sun/jini/test/impl/mahalo/MahaloIFTest.td
      [java] Test Passed: OK
      [java]
      [java] -----------------------------------------
      [java] com/sun/jini/test/impl/mahalo/MahaloImplReadyStateTest.td
      [java] Test Skipped: verifiers are: 
com.sun.jini.test.impl.mercury.ActivatableMercuryVerifier 
com.sun.jini.qa.harness.SkipConfigTestVerifier
      [java] -----------------------------------------
      [java] 
com/sun/jini/test/impl/mahalo/NestableServerTransactionCreatedToStringTest.td
      [java] Test Passed: OK
      [java]
      [java] -----------------------------------------
      [java] 
com/sun/jini/test/impl/mahalo/NestableTransactionCreatedToStringTest.td
      [java] Test Passed: OK
      [java]
      [java] -----------------------------------------
      [java] com/sun/jini/test/impl/mahalo/PrepareAndCommitExceptionTest.td
      [java] Test Passed: OK
      [java]
      [java] -----------------------------------------
      [java] com/sun/jini/test/impl/mahalo/PrepareAndCommitExceptionTest2.td
      [java] Test Passed: OK
      [java]
      [java] -----------------------------------------
      [java] com/sun/jini/test/impl/mahalo/PrepareAndCommitExceptionTest3.td
      [java] Test Passed: OK
      [java]
      [java] -----------------------------------------
      [java] com/sun/jini/test/impl/mahalo/PrepareAndCommitExceptionTest4.td
      [java] Test Passed: OK
      [java]
      [java] -----------------------------------------
      [java] com/sun/jini/test/impl/mahalo/PrepareAndCommitExceptionTest5.td
      [java] Test Passed: OK
      [java]
      [java] -----------------------------------------
      [java] com/sun/jini/test/impl/mahalo/RandomStressTest.td
      [java] Test Passed: OK
      [java]
      [java] -----------------------------------------
      [java] com/sun/jini/test/impl/mahalo/ServerTransactionEqualityTest.td
      [java] Test Passed: OK
      [java]
      [java] -----------------------------------------
      [java] com/sun/jini/test/impl/mahalo/ServerTransactionToStringTest.td
      [java] Test Passed: OK
      [java]
      [java] -----------------------------------------
      [java] com/sun/jini/test/impl/mahalo/TransactionCreatedToStringTest.td
      [java] Test Passed: OK
      [java]
      [java] -----------------------------------------
      [java] 
com/sun/jini/test/impl/mahalo/TransactionManagerCreatedToStringTest.td
      [java] Test Passed: OK
      [java]
      [java] -----------------------------------------
      [java] 
com/sun/jini/test/impl/mahalo/TxnMgrImplNullActivationConfigEntries.td
      [java] Test Skipped: verifiers are: 
com.sun.jini.test.impl.mahalo.ActivatableMahaloVerifier
      [java] -----------------------------------------
      [java] com/sun/jini/test/impl/mahalo/TxnMgrImplNullConfigEntries.td
      [java] Test Passed: OK
      [java]
      [java] -----------------------------------------
      [java] com/sun/jini/test/impl/mahalo/TxnMgrImplNullRecoveredLocators.td
      [java] Test Skipped: verifiers are: 
com.sun.jini.test.impl.mahalo.ActivatableMahaloVerifier
      [java] -----------------------------------------
      [java] com/sun/jini/test/impl/mahalo/TxnMgrProxyEqualityTest.td
      [java] Test Passed: OK
      [java]
      [java] -----------------------------------------
      [java] com/sun/jini/test/spec/txnmanager/AsynchAbortOnCommitTest.td
      [java] Test Passed: OK
      [java]
      [java] -----------------------------------------
      [java] com/sun/jini/test/spec/txnmanager/AsynchAbortOnPrepareTest.td
      [java] Test Passed: OK
      [java]
      [java] -----------------------------------------
      [java] com/sun/jini/test/spec/txnmanager/CommitExpiredTest.td
      [java] Test Passed: OK
      [java]
      [java] -----------------------------------------
      [java] com/sun/jini/test/spec/txnmanager/CommitTimeoutTest.td
      [java] Test Passed: OK
      [java]
      [java] -----------------------------------------
      [java] com/sun/jini/test/spec/txnmanager/GetStateTest.td
      [java] Test Passed: OK
      [java]
      [java] -----------------------------------------
      [java] com/sun/jini/test/spec/txnmanager/JoinIdempotentTest.td
      [java] Test Passed: OK
      [java]
      [java] -----------------------------------------
      [java] com/sun/jini/test/spec/txnmanager/JoinWhileActiveTest.td
      [java] Test Passed: OK
      [java]
      [java] -----------------------------------------
      [java] com/sun/jini/test/spec/txnmanager/ManyParticipantsTest.td
      [java] Test Passed: OK
      [java]
      [java] -----------------------------------------
      [java] com/sun/jini/test/spec/txnmanager/PrepareTimeoutTest.td
      [java] Test Passed: OK
      [java]
      [java] -----------------------------------------
      [java] com/sun/jini/test/spec/txnmanager/RollBackErrorTest.td
      [java] Test Passed: OK
      [java]
      [java] -----------------------------------------
      [java] com/sun/jini/test/spec/txnmanager/RollForwardErrorTest.td
      [java] Test Passed: OK
      [java]
      [java] -----------------------------------------
      [java] com/sun/jini/test/spec/txnmanager/TwoPhaseTest.td
      [java] Test Passed: OK
      [java]
      [java] -----------------------------------------
      [java]
      [java] # of tests started   = 1406
      [java] # of tests completed = 1406
      [java] # of tests skipped   = 52
      [java] # of tests passed    = 1388
      [java] # of tests failed    = 18
      [java]
      [java] -----------------------------------------
      [java]
      [java]    Date finished:
      [java]       Fri Jan 03 16:27:03 PST 2014
      [java]    Time elapsed:
      [java]       59325 seconds
      [java]
      [java] Java Result: 1

collect-result:
      [copy] Copying 1 file 
to<https://builds.apache.org/job/river-qa-refactor-win/ws/trunk\qa\result>
      [copy] Copying 1 file 
to<https://builds.apache.org/job/river-qa-refactor-win/ws/trunk\qa\result>
       [zip] Building 
zip:<https://builds.apache.org/job/river-qa-refactor-win/ws/trunk\qa\result\qaresults-amd64-Windows>
   Server 2008 R2-1.7.0.zip

BUILD FAILED
<https://builds.apache.org/job/river-qa-refactor-win/ws/trunk\build.xml>:2109: 
The following error occurred while executing this line:
<https://builds.apache.org/job/river-qa-refactor-win/ws/trunk\qa\build.xml>:406:
 The following error occurred while executing this line:
<https://builds.apache.org/job/river-qa-refactor-win/ws/trunk\qa\build.xml>:380:
 condition satisfied

Total time: 996 minutes 9 seconds
Build step 'Invoke Ant' marked build as failure
Archiving artifacts
