A few additional rules for your consideration:

- The data directory shouldn't be synced to Debian releases, and ought to be parallel to dists, not to main/contrib/non-free. (Since there are no executables, what's the benefit of syncing it, with the presumed multiplication of size and hassle? If a dataset needs a particular program or version, a simple dependency should be enough.)

A thought: Do we need to keep a source archive separate from the .deb? Almost all of these packages are effectively their own source, and since we are talking about large datasets, the burden of keeping both seems unnecessary.

Before you light up the flamethrower: I'm not promoting the idea of not releasing source. But consider a package that is basically a reproduction of a website. Do we really need two 7 MB packages whose fundamental difference is that they unpack into different locations?

Instead, we could make the .debs act like installers, except that they would grab the external archive from the CD or via HTTP. Or we could have a /usr/doc/<package>/debian directory whose rules file has a way to copy the installed data into a new tree in order to rebuild the package.

Steve