On Thu, Sep 11, 2008 at 12:44 AM, Jim Weirich <[EMAIL PROTECTED]> wrote:
> On Sep 4, 2008, at 2:33 AM, James M. Lawrence wrote:
>
> Hi James,
>
> Saw your announcement on ruby-talk and want to say good job on getting drake
> out. Now that 0.8.2 is released, I've taken some time to look at some of
> what you've done. It looks impressive.

Thanks.

> (1) Are you using Ruby threads or processes for the parallelism?

Threads. CompTree (http://comptree.rubyforge.org) has an option to fork
nodes, but I haven't enabled it for Drake. Since I expect -j to be
commonly used for compiling, forking would be redundant anyway. Still,
it's ready to go: the options [-k, --fork] are in rake.rb, just
commented out. Especially for the first release, I didn't see a
compelling need to go there yet. A single option -j felt nicer.

> (2) We should think about the semantics of the command "rake -j2 a b"
> Are "a" and "b" executed in parallel or sequentially? It looks like the
> code base goes with sequentially, and I think this is the right choice. But
> it may be worth a discussion.

Yes, I intentionally decided not to put a and b under the same parent
node. On the command line especially, we think sequentially.

> (3) I see a lot of the files are marked "GENERATED -- DO NOT EDIT".
> Generated from what? Will I be able to regenerate them if they need
> changing? Would it be better to just use CompTree as a gem?

The master files are under contrib/comp_tree. Running "rake pull_contrib"
gets the latest comp_tree from github, then repackages it under
Rake::CompTree in lib/rake/comp_tree.

I was never happy with this, btw. I wanted Rake to contain a fork of
CompTree while at the same time not conflicting with a regular
installation of CompTree. I experimented with dynamically evaling it
into Rake::CompTree, but my method failed to act transitively on the
sub-sub module Quix (utilities).

The other reason was that I thought having external dependencies might
give you pause about merging into the mainline. But if you are OK with
the dependency, we could use the gem.

Also, at the time I thought the CompTree API might change enough to
cause hassles as an external dependency. But since versioning seems to
work OK in rubygems, it's kind of a moot point. And CompTree may be
stable enough after all.

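To make the semantics from (1) and (2) concrete, here is a minimal sketch using plain Ruby threads. It is not Drake or CompTree code: the class TinyGraph and its methods task, thread_for and run_targets are hypothetical names made up for illustration, and the sketch assumes one thread per task with no cap, so it ignores the thread limit that -j would impose. Prerequisites of a target run concurrently, while the targets named on the command line run one after another.

```ruby
# Hypothetical illustration only; none of these names exist in Rake,
# Drake or CompTree.
class TinyGraph
  attr_reader :order

  def initialize
    @tasks = {}          # name => [prerequisite names, action]
    @threads = {}        # name => the single thread that runs that task
    @lock = Mutex.new    # guards @threads and @order
    @order = []          # completion order, recorded for inspection
  end

  def task(name, deps = [], &action)
    @tasks[name] = [deps, action]
  end

  # Return the one thread responsible for `name`, creating it on first
  # request so that a shared prerequisite runs only once.
  def thread_for(name)
    @lock.synchronize do
      @threads[name] ||= Thread.new { execute(name) }
    end
  end

  # Command-line targets run sequentially: each one is fully joined
  # before the next one is even started.
  def run_targets(*names)
    names.each { |name| thread_for(name).join }
  end

  private

  # Start all prerequisites concurrently, wait for them, then run the
  # task's own action.
  def execute(name)
    deps, action = @tasks.fetch(name)
    deps.map { |dep| thread_for(dep) }.each(&:join)
    action.call if action
    @lock.synchronize { @order << name }
  end
end

graph = TinyGraph.new
graph.task(:x) { sleep 0.01 }
graph.task(:y) { sleep 0.01 }
graph.task(:z) { sleep 0.01 }
graph.task(:a, [:x, :y])
graph.task(:b, [:y, :z])
graph.run_targets(:a, :b)   # like "rake -j2 a b": a finishes before b starts
```

Here x and y may finish in either order, but z does not start until a has completed, because b is not requested until a has been joined. Putting a and b under a common parent node would instead let z run alongside x and y.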
> (4) As far as I can tell, when running with num_threads > 1, you invoke all
> the tasks and gather the task arguments. Then you pass the task dependency
> graph off to the CompTree code to execute the code in parallel. So all the
> code execution actually happens after ALL the invokes are done on the code,
> rather than intermingled as in standard rake. Is my understanding correct?
> (if so, very interesting ... I'm thinking that if it wasn't for the need
> for the task arguments, you could skip the invoke step and pass the
> dependency graph immediately to your CompTree package, yes?)

Yes, excellent detective work. I wrestled with getting the task
arguments right on my own and finally gave up. Actually I still don't
understand them -- they seemed to be context-dependent. It was amazing
-- one unit test would fail and the other would succeed. I make a
change, and now it's swapped! The former succeeds and the latter fails.

> (5) I see there is a synchronization lock in the invoke method. Since this
> part of the code is executed by a single task (the main task), I'm not sure
> I see the need for a lock. Am I missing something?

Calling invoke inside invoke seemed to be a problem, both practically
and theoretically. Practically, the computation tree has already been
built, but now someone wants to build a new one with possibly
overlapping nodes. On the theoretical side, I said in the readme:

  Parallelizing code means surrendering control over the
  micro-management of its execution. Manually invoking tasks inside
  other tasks is rather contrary to this notion, throwing a monkey
  wrench into the system.

It seems the parallelizer cannot make good decisions if the user is
allowed to rearrange the furniture on a whim. But I leave the door open
here -- I haven't fully considered invoke inside invoke.

> (6) Have you tried running any of this under Ruby 1.9?

The CompTree unit tests will core dump 1.9.
But those unit tests pound extremely hard on the system, running many
threads with many forks on large trees. They crash cygwin too. On darwin
I have to catch EAGAIN errors, but the tests all succeed. I designed
them this way, of course, to see if I could shake out any race
conditions or other multi-threading problems that might exist.

CompTree might be OK in 1.9 for a Sunday drive. I haven't tried it with
small Rakefile cases yet.

> That's all for now. Again, thanks for the work you put into this. I'll
> probably have more questions later.

And thanks for Rake. It is a pleasure to be involved.

James M. Lawrence
_______________________________________________
Rake-devel mailing list
[email protected]
http://rubyforge.org/mailman/listinfo/rake-devel
