On 12/14/06, Stephen Potter <spp at unixsa.net> wrote:
>
> I think you need to go a little lower first.  A size limit isn't useful
> if the contents need to be sorted.  So, an easy first question is, does
> the file need to be sorted?

Currently, yes. It's then possible to locate an entry in it using a
binary chop. Not everything that can use this does (if pkgchk did it
could go *way* faster), but it's a significant optimization.

> If not, then does the file need to be ASCII
> or would a DB file be better?

I'm not convinced by the DB idea. One snag is that you can end up
requiring lots of random I/O rather than lots of sequential I/O as
you get with the contents file.

However, I can imagine an alternative format could be beneficial.
Such a format could be more compact and easier to parse - and actually
processing the contents file is quite expensive (in terms of both cpu
and memory - there's quite a lot of memory pressure coming from the
package tools, as the internal representation of the contents file is
several times the size of the contents file itself).

As a simple example, just storing the filename and not the full pathname
for files (the file is sorted, so the directory path is the last directory
you saw) could save 40% of the file size.

--
-Peter Tribble
http://www.petertribble.co.uk/ - http://ptribble.blogspot.com/
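
For illustration only, here is a minimal Python sketch of the two ideas in the message above: a binary chop over sorted pathnames, and a compact form that stores the full pathname only when the directory changes. It is not how the package tools work. The function names (find_entry, compact, expand) are hypothetical, and it assumes the input is a sorted list of bare absolute paths, ignoring the other fields of a real contents file entry.

```python
import bisect
import posixpath


def find_entry(sorted_paths, target):
    """Binary chop: return the index of target in sorted_paths, or -1."""
    i = bisect.bisect_left(sorted_paths, target)
    if i < len(sorted_paths) and sorted_paths[i] == target:
        return i
    return -1


def compact(sorted_paths):
    """Write the full path only when the directory changes.

    Entries in the same directory as the previous entry are stored as a
    bare filename. A bare filename never starts with "/", so the two
    kinds of line cannot be confused.
    """
    out = []
    current_dir = None
    for path in sorted_paths:
        directory, name = posixpath.split(path)
        if directory == current_dir:
            out.append(name)
        else:
            out.append(path)
            current_dir = directory
    return out


def expand(lines):
    """Inverse of compact(): rebuild the full pathnames."""
    out = []
    current_dir = None
    for line in lines:
        if line.startswith("/"):
            current_dir = posixpath.dirname(line)
            out.append(line)
        else:
            out.append(posixpath.join(current_dir, line))
    return out


# Hypothetical sample data, already in sorted order.
paths = sorted([
    "/usr",
    "/usr/bin",
    "/usr/bin/cat",
    "/usr/bin/ls",
    "/usr/lib",
    "/usr/lib/libc.so.1",
])
```

With this sample, find_entry(paths, "/usr/bin/ls") locates the entry without a linear scan, and expand(compact(paths)) gives back the original list. One tradeoff in this particular sketch: a bare-filename line cannot be compared on its own, so a binary chop directly over the compact form would have to scan back to the preceding full path.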
