On Sat, Dec 15, 2012 at 11:59 PM, Michael G Schwern wrote:
> We have a lot of serious problems because we lack a database of installed
> distributions, releases and files. There are serious problems with
> implementing one given A) the limitations of the standard Perl install and B)
> wedging it into existing systems. But I think I have a solution. It's similar
> to how metadata was slipped into the ecosystem without requiring authors to
> rewrite their releases or install a bunch of extra modules. It just happens
> as part of the normal CPAN module upgrade process.
>
> I've been thinking that a minimal package database could be created by putting
> some hooks into ExtUtils::Install::install(), which every Perl build system
> ultimately uses, to record what gets installed. That way when
> ExtUtils::Install is upgraded, the user gets a build database without
> upgrading everything else.
>
> This would be a fairly straightforward process at install time...
>
> 1) Copy everything to a temp directory
> 2) Record everything in that temp directory
> 3) Copy everything from temp into the real location
>
> You could probably optimize this by skipping the copy to temp and just having
> install() record stuff as it goes by, but this is the dumb, simple, robust way
> to do it.
>
> Storage is a problem. The only reliable "database" Perl ships with is DBM, an
> on-disk hash, so we can't get too fancy. It might take several DBM files, but
> this is enough to record information and do simple queries. What are those
> queries?
>
> * What version of the database is this?
> * What distributions are installed?
> * What release of a distribution is installed?
> * What files are in that release?
> * What version is that release?
> * What location was a release installed into? (core, vendor, site, custom)
> * What are the checksums of those files?
>
> And the basic operations we need to support.
>
> * Add a release (i.e. install).
> * Delete a release (and its files).
> * Delete an older version of a release (as part of install).
> * Delete an older version of a release, only if it's in the same release
> location. This is so CPAN installs don't delete vendor installed modules.
> * Verify the files of a release.
> * List distributions/releases installed.
>
> It would also store the MYMETA data which gives us a lot of information (such
> as dependencies) for free.

I can agree with all of that. Actually, starting a discussion about
this was on my to-do list for the last QA hackathon, but I didn't get
around to it. Ideally, it should replace not only packlists but also
perllocal.

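As a rough sketch of the "record everything in that temp directory" step quoted above: assuming the release has already been copied into a staging directory, the file list and checksums could be collected with core modules only. The helper name record_staged_files and the choice of SHA-256 are assumptions for illustration, not anything ExtUtils::Install provides today.

```perl
use strict;
use warnings;
use File::Find ();
use File::Spec;
use File::Temp qw(tempdir);
use Digest::SHA;

# Hypothetical helper: map every regular file under $staging
# (path relative to $staging) to its SHA-256 hex digest.
sub record_staged_files {
    my ($staging) = @_;
    my %files;
    File::Find::find({
        no_chdir => 1,
        wanted   => sub {
            return unless -f $File::Find::name;
            my $rel = File::Spec->abs2rel($File::Find::name, $staging);
            $files{$rel} = Digest::SHA->new(256)
                                      ->addfile($File::Find::name)
                                      ->hexdigest;
        },
    }, $staging);
    return \%files;
}

# Demo with a throwaway staging directory holding one module.
my $staging = tempdir(CLEANUP => 1);
mkdir "$staging/lib" or die "mkdir failed: $!";
open my $fh, '>', "$staging/lib/Foo.pm" or die "open failed: $!";
print {$fh} "package Foo; 1;\n";
close $fh or die "close failed: $!";

my $manifest = record_staged_files($staging);
print "$_ $manifest->{$_}\n" for sort keys %$manifest;
```

The resulting hash is roughly what a packlist holds today, plus the checksums needed for a later "verify" operation.
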
> This is all totally doable, and efficient enough, with a small pile of DBM
> files and Storable. Where to put the database is a bit more complicated, see
> the list of open problems below.

Given that Storable's format isn't forward-compatible, something more
stable such as JSON would be more appropriate.

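A minimal sketch of what that could look like, assuming SDBM_File (the DBM implementation bundled with perl) as the store and JSON::PP (in core since 5.14) for the values. The key names and record fields are invented for illustration. Note that SDBM limits each key plus value to about 1008 bytes, so a real file list would have to be split across keys or kept in a separate file.

```perl
use strict;
use warnings;
use Fcntl;                      # O_RDWR, O_CREAT
use SDBM_File;
use JSON::PP;
use File::Temp qw(tempdir);

my $dir  = tempdir(CLEANUP => 1);
my $json = JSON::PP->new->canonical;   # stable key order in the output

tie my %db, 'SDBM_File', "$dir/installdb", O_RDWR | O_CREAT, 0644
    or die "Cannot open install database: $!";

# Hypothetical layout: one key for the schema version,
# one JSON-encoded record per installed release.
$db{'schema_version'}  = 1;
$db{'release:Foo-Bar'} = $json->encode({
    version  => '1.23',
    location => 'site',
    files    => ['lib/Foo/Bar.pm'],
});

# Reading it back needs nothing beyond the same two modules.
my $record = $json->decode($db{'release:Foo-Bar'});
print "Foo-Bar $record->{version} installed in $record->{location}\n";

untie %db;
```

Because the values are plain JSON text, an older or newer installer can at least parse the record, which is not guaranteed with Storable.
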
> There's lots and lots and lots of additional information which could be stored
> and queries and operations to allow, but if we can get the basics working
> it'll allow a heap of new solutions. And I think this is a SMOP.
>
>
> Future possibilities include...
>
> * Auto-upgrade to SQLite if ExtUtils::Install::DB::SQLite is installed.
>
> If a special module is installed we can offer SQLite support (or whatever) for
> a more advanced database. At install time it would copy the existing DBM
> system into its own database.
>
> In general, more functionality can be added as more optional (or bundled)
> dependencies are available to the system. Through it all the basic DBM
> database would continue to be redundantly maintained to provide a fallback
> should those optional modules break or go away.

Having a proper database would be really nice, but I'm not sure if
it's going to be worth the hassle if we have a robust system already.

> * Upgrading the database.
>
> I'd like to put some thought into how things are laid out initially to avoid a
> lot of major revisions, and thought into what information should be recorded
> so it's available later, but eventually we're going to want to change the
> "schema", such as it is with DBM files.
>
> I figure this can happen as part of upgrading ExtUtils::Install. It checks
> what version of the database you have and performs the necessary transforms to
> bring it up to the current version. We know how to do this, just have to keep
> it in mind and remember to implement it.
>
> * Where to put the database? What about non-standard install locations?
>
> $Config{archlib} would seem the obvious location, but it presents a
> permissions problem. If a non-root user installs into their home
> directory, you don't want them needing root to write to the installation
> database. There are several ways to deal with this.
>
> One is to simply not record non-standard install locations, but this loses
> data and punishes all those local::lib users out there.
>
> Another is to have a separate install database for non-standard install
> locations. This makes sense to me, but it brings in the sticky problem
> of having to merge install databases. Sticky, but still a SMOP. Once you
> have to implement merging anyway, it now makes sense to have an install
> database for each install location. One for core. One for vendor. One for
> site. And one for each custom location. This has a lot of advantages to
> better fit how Perl layers module installs.
>
> * allows separation of permissions
> * allows queries of what's installed based on what's in @INC
>
> That second one is important. When a normal user queries the database, they
> want to get what's installed in the standard library location. When a
> local::lib user queries the database, they want to get what's installed in the
> standard library locations AND their own local lib.

The combination of these is problematic. You might upgrade
ExtUtils::Install in your local module path, but not have write
permissions on the system paths. In practice, we might have to support
every older version of the database format :-|

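For the querying side, a sketch of merging per-location databases along a search path, assuming each install location keeps its own database and that earlier directories shadow later ones the way @INC does. The directory names, distribution versions and the merged_view helper are all made up. The sketch also sidesteps the problem above by assuming every database is in a format the reader understands.

```perl
use strict;
use warnings;

# Hypothetical per-location databases: install root => { distribution => version }
my %db_for = (
    '/usr/lib/perl5/core'   => { 'ExtUtils-Install' => '1.54' },
    '/usr/lib/perl5/vendor' => { 'Moose'            => '2.06' },
    '/home/me/perl5/lib'    => { 'ExtUtils-Install' => '1.58',
                                 'Try-Tiny'         => '0.11' },
);

# Merge the databases found along the search path; the first directory
# that knows a distribution wins.
sub merged_view {
    my ($db_for, @search_path) = @_;
    my %seen;
    for my $dir (@search_path) {
        my $db = $db_for->{$dir} or next;    # no database for this directory
        for my $dist (keys %$db) {
            $seen{$dist} //= { version => $db->{$dist}, from => $dir };
        }
    }
    return \%seen;
}

# A local::lib user sees their own directory first, then the system ones.
my $view = merged_view(\%db_for,
    '/home/me/perl5/lib', '/usr/lib/perl5/vendor', '/usr/lib/perl5/core');

printf "%s %s (%s)\n", $_, $view->{$_}{version}, $view->{$_}{from}
    for sort keys %$view;
```
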
> Not perfect, but gets us off the ground. It's not a great database, but it
> does the important job of recording the critical install-time data for later
> use. It's implementable within the current system. It doesn't require a bunch
> of dependencies, just one upgrade. It works with most existing module
> releases. It solves a major design problem with the Perl module system.
>
> I think it's a Simple(?!) Matter Of Programming in ExtUtils::Install to get it
> off the ground. IMO the most important bit of coordination is putting some
> thought into what the basic database should look like so we don't have to
> worry about complicated upgrades later.

I'm not sure it's as simple as you make it sound, but it is a good
idea nonetheless.

Leon