On 28 July 2011 11:16, Greg Brown <gk_br...@verizon.net> wrote: >>> I think that making TaskGroup a Group was a mistake on my part. There is no >>> need to ensure uniqueness here. > > Actually, it just occurred to me that uniqueness, while not strictly > required, should certainly be encouraged. If you added the same task to a > ParallelTaskGroup's task sequence more than once, you'd basically get > serialized behavior, since only one thread could run the task at a time. So > perhaps a Group is actually appropriate.
I think that really depends on the intent again. I agree that in many cases the supplied Tasks would not contain any duplicates, but is there a strong reason to actually enforce it rather than just document the behaviour? Isn't the primary purpose of ParallelTaskGroup just to submit and wait? The documentation might simply state that it will *attempt* to run the supplied Tasks in parallel, but that it is limited because only one 'copy' of a duplicated Task can run at any time. I'd be happy enough with that, and the documentation can 'encourage' users not to supply duplicates. I can't see duplicates *breaking* anything, but can see them *changing* the observed behaviour. The *attempt* part is important as even non-duplicated Tasks might still be run in series if they were all created with a reference to the same 'SingleThreadExecutor' ExecutorService. This should also be documented for clarity. However if the intent is more to guarantee as far as possible that all submitted tasks will be started at the same time, then insisting on Group<Task> makes sense (would this be ever required?). But then so does the ability to control which ExecutorService is actually used to execute the tasks. The 'TaskGroup' could create its own unbounded ExecutorService (or used one supplied to it) to actually execute each of the tasks, overriding any ExecutorServices supplied to the Tasks' constructors. > Further, it occurred to me that TaskSequence (or SerialTaskGroup) is not > really all that useful, since it effectively executes the tasks > synchronously. The main value of a Task (or more generally, a thread) is > concurrency. Any old procedural code can run tasks in series. Yeah, that is why I questioned its value. Iterating over a Sequence and listening for some listeners is not overly complex. > Worse, TaskSequence executes sub-tasks by creating (or at least obtaining) a > new thread per task, which isn't really necessary. The same results could be > achieved by simply calling the synchronous version of execute() (the one that > doesn't take a listener argument) on each task in the list. This was part of the cause of my confusion with these classes. I didn't see why they needed to be Tasks themselves. It is straightforward to create and execute new Task if and when I want to, but I can't extract the executable portion of code from a Task once it is created. If the parallel execution code is wrapped in a Task, then that decision has been made for me. When I first read the original javadoc comment ... "Class that runs a group of tasks in parallel and notifies listeners when all tasks are complete." I thought it might be a fancy TaskListener implementation, or just a simple class that iterated over the collection of Tasks, launching them and waiting for completion using a listener of its own creation (which is what TaskGroup actually does do) but I didn't think it would extend Task. Do you remember if it was a conscious choice for these classes to extend Task from the outset, or do they do so because they use TaskListener for notifications? Making use of the TaskListener interface that is closely associated with Task is understandable, rather than creating an entirely new listener class to serve much the same purpose, but it wouldn't be out of the question to do so. If there was a dedicated TaskGroupListener interface, then it would break the connection to Task and there would be no need for these classes to extend Task in the first place. (Unless that is part of their design) As I mentioned before, I was thinking of TaskGroup as a 'ParallelExecutor'. Would it help to concentrate on the actual execution of the tasks and waiting for their completion in the 'TaskGroup'/'ParallelExecutor', rather than the implementation of a Collection too? (This is less relevant if you are considering deprecating TaskSequence, but serves to illustrate the idea). Then the functionality can be described as 'a class that submits Tasks for parallel execution and waits until they have all completed. The class could have a Collection<Task> property, or a static execute method that takes a Collection<Task> as a parameter. It would be up to the user to control what Tasks they submit for execution. The 'TaskGroup' / 'ParallelExecutor' would just be concerned with executing and waiting for those tasks to complete. Then it distills to more of a set of static utility methods public class TaskGroup { // Synchronous methods // Submit tasks for parallel execution (using their individual, internal ExecutorServices) and wait for completion public static void execute(final Collection<Task> tasks) {...} // Submit tasks to the provided ExecutorService for parallel execution and wait for completion // This requires a Task#setExecutorService(ExecutorService) method public static void execute(final Collection<Task> tasks, final ExecutorService executorService) {...} // Asynchronous methods // Cleaner (but less helpful) to make the user to wrap the previous methods in // their own Task if they want to run them asynchronously, rather than having // async methods here, but either is possible. // Creates and executes a parent Task using the 'parentTaskExecutorService' (or a default ExecutorService if null) // The parent Task will submit the supplied Collection of Tasks to the 'subTaskExecutorService' for parallel execution and wait for their completion. // If 'subTaskExecutorService' is null, the sub-Tasks will be executed with their internal ExecutorServices. // The TaskListener will be called when the parent Task completes. // See execute(Collection<Task>, ExecutorService) public static void execute(final TaskListener<V> parentTaskListener, final ExecutorService parentTaskExecutorService, final Collection<Task> subTasks, final ExecutorService subTaskExecutorService) {...} // More variations... } (These could even be named less generically and pushed into Task itself.) That would allow code like this to work as I had hoped/expected // Running synchronously, but could be wrapped in a Task if required // No more than 3 Tasks should be running at any time TaskGroup.execute(manyTasks, Executors.newFixedThreadPool(3)); // Running asynchronously using a Task created by the utility method // No more than 3 Tasks should be running at any time TaskGroup.execute(completionListener, null, manyTasks, Executors.newFixedThreadPool(3)); (I just typed all of this as it came to me, so I might well have overlooked some reason why this approach would not work) > So, I'm actually thinking that: > > a) TaskSequence should be deprecated/removed +1 > b) TaskGroup is actually an appropriate name :-) -0 :) If changes will be made as a result of this thread, then I think Task should have a setExecutorService(ExecutorService) method too, or its 'executorService' field made final if it is not supposed to be changed. My preference would be the former. Yesterday I changed TaskGroup's javadoc comment from "Class that runs a group of tasks in parallel and notifies listeners when all tasks are complete." to "Task that runs a group of tasks in parallel and notifies listeners when all tasks are complete." but it could go a little further again (and capitalize the class names) "Task that attempts to execute a Group of Tasks in parallel. This Task completes when all sub-Tasks have completed." And then include some detail about possible side effects from the ExecutorServices of the sub tasks. (Obviously this all depends on how the classes evolve.) Chris