Hello all,

I have recently implemented a build system based on Rake for a decently-sized project (~8000 source files) which builds code and data for several libraries and applications targeting four separate platforms.

I was new to both Rake and Ruby when I started, and it's a testament to the awesomeness of both that I was able to learn them quickly enough to implement a fairly complicated system in less than a month.

Anyway, I wanted to pass along some comments.

First, when I initially got things building with Rake (replacing a build environment that used GNU make), I noticed that a null build (where everything was up to date) took a really long time. I used the Ruby profiler and found that approximately half the run time was spent doing file stats (exist? and mtime), with an average of 70 calls *per file* during a rake run. Because the build runs under Cygwin on Windows, those file stat operations are more expensive than they are on Linux. I should note that I'm using Ruby 1.8.7 and the Rake that gem installed (0.8.7).

If you look at the FileTask code, the needed? method calls File.exist? and then timestamp, which itself calls File.exist? and then File.mtime. That's three stats in a row right there for the file itself. Each prerequisite is then asked for its timestamp, which generates two more stats apiece (for FileTasks, that is). My approach was to create a simple global cache that uses the filename as the key and stores the file's mtime, and to override FileTask to use it:

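module Rake
    # Modify Rake's FileTask to use our cached file tests
    class FileTask < Task
        # Needed if the target is missing or older than a prerequisite.
        def needed?
            ! File.cached_exist?(name) || out_of_date?(timestamp)
        end

        # Cached mtime of the target, or Rake::EARLY if it doesn't exist.
        def timestamp
            if File.cached_exist?(name)
                File.cached_mtime(name.to_s)
            else
                Rake::EARLY
            end
        end

        # Run the action, then drop the cache entry for the target,
        # since the action has probably created or modified it.
        def execute(args=nil)
            ret = super
            File.invalidate_cache(name)
            ret
        end
    end
end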

The invalidate_cache method simply deletes the cache entry for the given file, which is necessary if the file was changed by the task's action. The execute method does this for the FileTask's own target. If an action creates or modifies other files as a side effect, I explicitly call invalidate_cache in the action block for each side-effect file to make sure the cache doesn't contain any stale info.
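
The three File helpers used above (cached_exist?, cached_mtime and invalidate_cache) are not shown in this message. A minimal sketch of what they might look like follows. The helper names come from the snippet above, but the bodies are an assumption, including the choice to store nil for a missing file so that a single File.mtime call answers both questions:

```ruby
# Hypothetical sketch of the cache helpers; the original implementation
# was not posted, so everything in this block is an assumption.
class File
  # filename => Time (mtime), or nil when the file does not exist
  @stat_cache = {}

  class << self
    # Return the cached mtime for name, or nil if the file is missing.
    # Only the first call per file touches the filesystem.
    def cached_mtime(name)
      key = name.to_s
      unless @stat_cache.key?(key)
        @stat_cache[key] = begin
          File.mtime(key)
        rescue SystemCallError
          nil # missing or unreadable file: remember that it was absent
        end
      end
      @stat_cache[key]
    end

    # True if the file existed when it was first looked up.
    def cached_exist?(name)
      !cached_mtime(name).nil?
    end

    # Forget what we know about name so the next lookup stats it again.
    def invalidate_cache(name)
      @stat_cache.delete(name.to_s)
    end
  end
end
```

With this shape, each file costs one stat per rake run unless a task invalidates it. Anything that writes files behind the cache's back has to call invalidate_cache, as described above.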

This change resulted in an order of magnitude improvement in run-time.

Now, I'm not saying Rake should adopt this specific optimization; however, I think you should consider some type of caching to reduce the number of exist?/mtime calls. Perhaps the FileTask could simply cache its own exist?/mtime results (invalidated when execute runs).
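
To make the per-task idea concrete, here is a rough sketch under stated assumptions. MemoizedStat, its method names and DemoFileTask are all hypothetical; DemoFileTask is a stand-in used only so the example runs without Rake, and none of this is Rake's actual code:

```ruby
# Hypothetical mixin: each object memoizes one stat of its own file.
# The including class must provide a name method returning the path.
module MemoizedStat
  # Return the file's mtime, or nil if it is missing.
  # Stats the file at most once until reset_stat_memo is called.
  def memoized_mtime
    unless @stat_done
      @stat_mtime = begin
        File.mtime(name.to_s)
      rescue SystemCallError
        nil # missing or unreadable file
      end
      @stat_done = true
    end
    @stat_mtime
  end

  def memoized_exist?
    !memoized_mtime.nil?
  end

  # Forget the memoized result so the next query stats the file again.
  def reset_stat_memo
    @stat_done = false
  end
end

# Stand-in for a file task, only to keep the sketch self-contained.
class DemoFileTask
  include MemoizedStat
  attr_reader :name

  def initialize(name, &action)
    @name = name
    @action = action
  end

  # Run the action, then invalidate the memo, because the action
  # has probably created or modified the target file.
  def execute
    @action.call(self) if @action
  ensure
    reset_stat_memo
  end
end
```

Unlike the global cache, this keeps the state on the task itself, so there is no shared table to maintain. The trade-off is that side-effect files written by another task's action would still need some explicit invalidation.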

I have more to say about dependency generation and multitasking, but I'll send those thoughts in separate emails.

-Heath



_______________________________________________
Rake-devel mailing list
[email protected]
http://rubyforge.org/mailman/listinfo/rake-devel
