Quoting Lukasz Szybalski <[email protected]>:

> I guess the question would be: Could you describe the type of data you
> currently have. (percentage of space, downloads, changes)
>
This is the directory that has broken the system (watch out: it may
break your browser):

http://ukparse.kforge.net/svn/undata/pdf/

It's several thousand large PDFs of UN documents. The same would apply
to scanned images, archived pages from Hansard, etc.

At the moment I'm storing it in SVN as a means of distribution, but it
unnecessarily doubles the disk usage, and some of the SVN clients are
very unhappy with the size of the directory.

SVN is entirely inappropriate for these large binary files (there are
no versions). It's convenient only because the code that handles these
binary files is in SVN (where it belongs), and the fewer means of
distribution the better. But it's not scaling any more.

We need a better answer for parking the data for these projects, where
we'd keep the scraping/parsing code in SVN on kforge (SVN is designed
for code), and handle these large sets of large non-versioned files
some other way.

Julian.

_______________________________________________
okfn-discuss mailing list
[email protected]
http://lists.okfn.org/cgi-bin/mailman/listinfo/okfn-discuss
