Hi,

Marcel Reutegger wrote:
> Hi,
>
> 2009/7/12 Jukka Zitting <jukka.zitt...@gmail.com>:
>> Hi,
>>
>> 2009/7/8 Marcel Reutegger <marcel.reuteg...@gmx.net>:
>>> - parallel execution of some work. this is primarily to make use of
>>> multi-core processors. execution should be distributed over and
>>> executed by N threads, where N is a factor of the available processors.
>> If I recall correctly we debated this already earlier. My point was
>> that limiting the number of tasks to the number of available
>> processors may not be a good approach as the tasks may be IO-bound or
>> block for other reasons, in which case having more task threads would
>> give you better throughput. But I recall being proven wrong, did we
>> have some benchmark for that? Do you remember where this discussion
>> was?
>
> I don't remember either... But let's just start a new one.
>
> I think this very much depends on the work that needs to be
> distributed. there is no proof that one way is better than the other.
> for CPU intensive work we'd probably want to limit the number of
> concurrent tasks. for I/O intensive work the concurrency should be
> higher.
>
> my above point was rather related to CPU intensive work, e.g. creating
> a posting list while content is indexed. but of course there might be
> other work that may be parallelized more aggressively.
>
> I guess the actual pool shouldn't care about that. some utility on top
> of the pool should provide that functionality, i.e. execute a number of
> tasks with a given level of concurrency. the utility would then
> dispatch the tasks to the pool accordingly.
>
>>> - Timers used in TransactionContext and MultiIndex. This could be
>>> turned into a scheduling mechanism that could also be used by the
>>> ClusterNode sync. Other classes that use periodic checks in a
>>> background thread: DatabaseJournal (ClusterRevisionJanitor),
>>> CooperativeFileLock (watch dog).
>> Yep. Perhaps we could also reuse some of the scheduling functionality
>> in Sling.
>
> I'm not sure this is needed. the java rt library already comes with
> Timer and TimerTask classes. our needs are very simple and I'm not sure
> that justifies a new dependency.

Yes, AFAICT Java also has ThreadPool implementations. If not, I urge us
still _not_ to reinvent the wheel and to take something existing, even
if it adds a single dependency.

Regards
Felix

>
>>> the more I think about it, the more I like your idea. but we should be
>>> careful with a maximum size for a repository wide pool. extensive use
>>> of the pool by a module should not lock up another module just because
>>> there are no more idle threads. maybe that global pool shouldn't have
>>> a maximum size...
>> That might make sense. Perhaps we should have some concept of
>> sub-pools (that borrow from the main pool) with fixed limits for tasks
>> that need them (see above).
>
> hmm, that doesn't sound flexible and generic. I just thought again how
> cool it would be if we could deploy Jackrabbit into Google App Engine.
> that however requires that all background threads are removed. if we
> have that generic pool and client code adjusted accordingly it could be
> as easy as turning the pool into a direct executor variant ;) well,
> that's very optimistic but sounds promising to me...
>
> regards
> marcel
>
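
For illustration, here is a minimal sketch of the "utility on top of the
pool" idea from the thread: run a batch of tasks on a shared executor
with a given level of concurrency. It uses only java.util.concurrent.
The class name BoundedDispatcher and the method runAll are hypothetical
and are not part of Jackrabbit; error handling is deliberately crude.

```java
import java.util.List;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executor;

/** Hypothetical helper (not a Jackrabbit class): bounded concurrency on a shared pool. */
final class BoundedDispatcher {

    private BoundedDispatcher() {}

    /**
     * Runs all tasks on the given executor, occupying at most
     * {@code concurrency} of its threads at a time, and blocks until
     * every task has finished.
     */
    static void runAll(Executor pool, int concurrency, List<? extends Runnable> tasks)
            throws InterruptedException {
        if (concurrency < 1) {
            throw new IllegalArgumentException("concurrency must be >= 1");
        }
        // Shared work queue; each worker drains it until it is empty.
        Queue<Runnable> queue = new ConcurrentLinkedQueue<>(tasks);
        int workers = Math.min(concurrency, tasks.size());
        CountDownLatch done = new CountDownLatch(workers);
        for (int i = 0; i < workers; i++) {
            // Only 'workers' jobs are handed to the pool, so the pool
            // itself does not need to know about the concurrency limit.
            pool.execute(() -> {
                try {
                    Runnable task;
                    while ((task = queue.poll()) != null) {
                        try {
                            task.run();
                        } catch (RuntimeException e) {
                            // Assumption: a failing task must not stop the others.
                            System.err.println("task failed: " + e);
                        }
                    }
                } finally {
                    done.countDown();
                }
            });
        }
        done.await();
    }
}
```

With this shape, the "direct executor variant" mentioned above needs no
change in client code: passing `Runnable::run` as the Executor makes the
first worker drain the whole queue in the calling thread, so no
background threads are involved.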