Tim Bunce <[EMAIL PROTECTED]> wrote:
> On Tue, Oct 28, 2003 at 02:37:29PM +0000, [EMAIL PROTECTED] wrote:
>>
>> I ran some more tests, some of which might be more significant:
>>
>>                      time(sec)   db size (kB)   peak RAM (MB)
>>  no coverage                15            ---          ~ 10
>>  Data::Dumper+eval         246            245          ~ 23.4
>>  Storable                  190             60          ~ 19.7
>>  no storage                184            ---          ~ 18
>
> Excellent. From 23.4-18 to 19.7-18 is 5.4 to 1.7. So Storable is
> taking only 30% of the time that Data::Dumper+eval took.
You're looking at the column for peak RAM usage. The time difference
is 62 (246-184) vs 6 (190-184). So Storable is taking about 10% of
the time that Dumper+eval took. File IO is now pretty insignificant
next to the overhead of doing coverage. Hopefully that number will
come down some, eventually.

>> Eventually, I think that a transition to a real database (where
>> you can read/write only the portions of interest) would be good.
>
> How would you define "portions of interest"?

The files you're actually adding/updating coverage for. Right now, if
your cover_db holds data for a dozen files, but you test them one at
a time, you have to read and write *all* the coverage data (as well
as have the RAM to hold it). That's a lot of unnecessary work and
wasted memory.

> Certainly some changes are needed in the higher level processing.
> But there's possibly no need for a "real database" (if you mean
> DBI/SQL etc which carry significant overheads). Multiple files, for
> example, may suffice.

I did have DBI in mind, but I'll agree that the overhead must be
considered. If multiple files or a tie() to disk works and is faster,
I'm all for it.

-mjc
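
P.S. For anyone following along, a rough (untested) sketch of the two
storage schemes being compared. The %cover hash and file names here
are made up for illustration; the real Devel::Cover database is a
good deal more involved.

  #!/usr/bin/perl
  use strict;
  use warnings;
  use Data::Dumper;
  use Storable qw(nstore retrieve);

  my %cover = (
      'lib/Foo.pm' => { statement => [ 1, 1, 0, 3 ] },
      'lib/Bar.pm' => { statement => [ 2, 0 ] },
  );

  # Data::Dumper+eval: the database is Perl source, so loading it
  # means re-parsing and re-compiling the whole structure.
  $Data::Dumper::Terse = 1;           # drop the '$VAR1 =' prefix
  open my $out, '>', 'cover_db.dumper' or die "write: $!";
  print {$out} Dumper(\%cover);
  close $out;

  open my $in, '<', 'cover_db.dumper' or die "read: $!";
  my $src = do { local $/; <$in> };
  close $in;
  my $slow = eval '+' . $src;         # '+' forces a hashref, not a block
  die $@ if $@;

  # Storable: a binary image is written and read back directly -- no
  # parse/compile pass, hence the large difference in load time.
  nstore(\%cover, 'cover_db.storable');
  my $fast = retrieve('cover_db.storable');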
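
P.P.S. And the tie()-to-disk idea, equally rough: something like
MLDBM over DB_File, with Storable as the serializer, gives a hash on
disk with one record per covered file, so a test run reads and writes
only the entries it touches. (The module choice is just an example; I
haven't tried it against Devel::Cover.)

  #!/usr/bin/perl
  use strict;
  use warnings;
  use Fcntl qw(O_CREAT O_RDWR);
  use MLDBM qw(DB_File Storable);     # DBM backend + serializer

  tie my %db, 'MLDBM', 'cover_db.dbm', O_CREAT | O_RDWR, 0640
      or die "tie: $!";

  # Update coverage for a single file: only this record hits disk.
  # (MLDBM can't change nested data in place -- fetch, modify, store.)
  my $rec = $db{'lib/Foo.pm'} || {};
  push @{ $rec->{statement} }, 1;
  $db{'lib/Foo.pm'} = $rec;

  # Reading one file's data leaves the other records untouched.
  my $foo = $db{'lib/Foo.pm'};

  untie %db;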