OK, I've made some progress here, but the results are not very good.

I tested out a patch to efs-perl that treats versions as v-strings,
but that's just a trade-off.  That algorithm can sort HTML-Template's
versions, but then it breaks on Module-Build and DateTime.  There
does not exist a single algorithm that correctly implements version
sorting automatically.  I am convinced we have no choice here but to
manage this with project-specific metadata.
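
To make the trade-off concrete, here is a minimal Python sketch (not
the efs-perl code).  The two key functions are simplified stand-ins
invented for illustration, not reimplementations of version.pm or
Sort::Versions, and the release names are hypothetical.  It only shows
that two reasonable readings of the same strings give opposite orders:

```python
def decimal_key(release):
    """Read "2.10" as the decimal number 2.10 (assumed, simplified)."""
    return float(release)


def dotted_key(release):
    """Read "2.10" as the integer components (2, 10) (assumed, simplified)."""
    return tuple(int(part) for part in release.split("."))


releases = ["2.9", "2.10"]  # hypothetical release names

# The decimal reading puts 2.10 (i.e. 2.1) before 2.9.
by_decimal = sorted(releases, key=decimal_key)

# The dotted reading puts 2.9 before 2.10, since 9 < 10.
by_dotted = sorted(releases, key=dotted_key)
```

Neither reading is wrong in general; which one is right depends on
the project's own convention, which is why it has to be metadata.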

Now, as noted below, it's easy to manage this in efsdeploy's config
data; however, that information is intended to be a BUILD-time
dependency ONLY.   I do not, under any circumstances, want that data
to become a RUNTIME dependency.

The only realistic solution I can see is to have efs-perl read this
information from something like:

    /efs/dist/perl5/EFS-Perl-Config/current

IOW, we'll still use an autorelease (the EFS 3 replacement for incr
releases -- they kick ass), but it will be compiled from the
deploy-config data, and distributed separately.  In reality, we will
only need to specify a special sort order as an exception in the perl5
metaproj, so I expect this data to change fairly infrequently.  The
deploy-config stuff, OTOH, will change a LOT, and this will allow us
to update the deploy-config stuff as often as we want, with no risk to
your production systems.

This seems reasonable for several reasons.  First, remember the edge
condition this applies to.  When EPD is expanding a dependency tree,
and there are two DIFFERENT releases specified at the same level of
the dependency tree, then and only then does this code kick in.
Honestly, we're splitting hairs here, but it's important to get this
right.
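
The edge condition can be sketched roughly as follows.  This is an
illustration in Python, not EPD's code; the function name, the
(project, release) tuple layout, and the release names are all
assumptions made up for the example:

```python
from collections import defaultdict


def conflicting_projects(dependencies):
    """Return {project: set of releases} for every project that is
    named with more than one distinct release at this level."""
    releases = defaultdict(set)
    for project, release in dependencies:
        releases[project].add(release)
    # Only projects with two or more distinct releases need sorting.
    return {p: r for p, r in releases.items() if len(r) > 1}


# One level of a hypothetical dependency tree.
level = [
    ("HTML-Template", "2.9"),
    ("DateTime", "0.70"),
    ("HTML-Template", "2.10"),
]
conflicts = conflicting_projects(level)
```

Here only HTML-Template would reach the sorting code; DateTime is
named with a single release, so there is nothing to decide.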

I'm working through a rebuild of my perl5/core release, to introduce
the perl5/EFS-Perl-Indirect project, and this is a good time to code
something up for EFS-Perl-Config, which should just be a couple of
hooks in some efsdeploy rules.   It should be straightforward to have
efs-perl query this information.

On Mon, Jan 9, 2012 at 12:03 PM, Phillip Moore
<[email protected]> wrote:
> I'm in the middle of totally reworking how dependencies are managed in
> the efsdeploy code, and I've finally run into a problem I've been
> avoiding for a while.  How do we manage the metadata that specifies
> how a project's releases are sorted?
>
> This is a LOT harder than you might think.  First of all, the main
> problem is that we need this chunk of metadata in several places.
> The first time this problem reared its head was EFS::Perl::Depends,
> which was really the first time we had to think about this.  The first
> major release of EPD got it totally wrong, and used Sort::Versions to
> sort everything, which is really only correct for a subset of the CPAN
> distros.  In V2 I fixed this by switching to version objects, which is
> the correct default, but then you discover that there are distros like
> HTML-Template that use versions which do NOT sort correctly with
> version.pm.  Those, you have to use Sort::Versions for.
>
> This defines an important requirement for managing this information:
> it has to be available to EPD, and be CHEAP.   EFS database attributes
> would be a nice place to manage this, but we can't have
> sitecustomize.pl querying a database for every perl invocation.  Right
> now, EPD is dependent only on the metadata.conf text files found in
> each release.  Wherever we end up managing this information, it will
> have to be written out into these metadata.conf files by efsdeploy
> (easy).
>
> Right now, I see 3 distinct sorting algorithms.  Here are the keywords
> I'll be using in the code to describe them:
>
> [*] gnu (Sort::Versions)
>
> Every single project on gnu.org uses a versioning scheme that sorts
> correctly with Sort::Versions, and this will probably work for most
> open source projects that use a major.minor.subminor release
> convention.
>
> [*] perl (version objects)
>
> MOST of CPAN can be sorted using version.pm objects, with a few
> exceptions we'll have to configure, as described above.
>
> [*] text (natural perl "sort" order)
>
> This is how you would sort, say, EFS autoreleases (YYYYMMDD-HHMMSS),
> for example, or anything that uses a convention that can be sorted as
> plain text.  Autoreleases are a stupid example, though -- you always
> access them (BY DESIGN) through the current releasealias.
>
> So, where do we manage this data?  We want to be able to specify it in
> ONE place, and then have it propagate wherever else it is needed as a
> by-product of the automated efsdeploy workflow.  Seems obvious to me
> that this is project-specific information that belongs in the
> deploy-config rules, and therefore in globals.conf:
>
> [config]
>    sort = gnu | perl | text
>
> This will allow us to specify system defaults, so the gnu system would
> obviously default to "gnu", and perl5 would default to "perl".
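
A minimal sketch of how such a setting might be resolved, written in
Python with the standard configparser module.  This is not efsdeploy
code: the function name, the idea of passing the globals.conf text in
as a string, and the defaults table are assumptions for illustration;
only the [config] section, the "sort" key, and its three values come
from the proposal above.

```python
import configparser

# System defaults as described in the proposal (gnu -> gnu, perl5 -> perl).
SYSTEM_DEFAULTS = {"gnu": "gnu", "perl5": "perl"}
VALID_ALGORITHMS = {"gnu", "perl", "text"}


def sort_algorithm(system, globals_conf_text):
    """Return the sort keyword for a project: the value from its
    globals.conf if present, otherwise the default for its system."""
    parser = configparser.ConfigParser()
    parser.read_string(globals_conf_text)
    value = parser.get("config", "sort", fallback=None)
    if value is None:
        value = SYSTEM_DEFAULTS.get(system)
    if value not in VALID_ALGORITHMS:
        raise ValueError(f"no usable sort algorithm for system {system!r}")
    return value
```

For example, sort_algorithm("perl5", "") falls back to "perl", while
a project whose globals.conf contains "[config]" followed by
"sort = gnu" gets "gnu" regardless of its system.
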
>
> This would make things easy for efsdeploy, which is doing the
> expansions per-project.   However, for EFS::Perl::Depends (the only
> other consumer of this information right now), it will be a little
> trickier, since it currently just assumes "perl" and falls back on
> "gnu" (i.e. using the Sort::Versions algorithm) only when the releases
> can't be converted into version objects.
>
> There are two approaches I could take with EPD.   First, since we now
> (as of last week) have all the project-specific efsdeploy rules
> available in the various */deploy-config-* projects, it might be
> possible for EPD to query that data.     The second approach would be
> to only worry about this when we have to.   For example, if a project
> only has one release, then there's nothing to sort.  Only if a project
> has two or more releases do we need to query the sort algorithm.   In
> this case, we could read the metadata.conf for all the releases, make
> sure they all specify the same algorithm, and then sort.  If the
> releases specify different algorithms, abort, and require the
> dependency to be specified manually.
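
The second approach could look roughly like this.  It is a Python
sketch under assumptions: the function name is made up, and the input
is a plain mapping from release name to the sort keyword already read
out of that release's metadata.conf (the file format itself is not
shown here).

```python
def common_sort_algorithm(algorithm_by_release):
    """Return the sort keyword shared by all releases, or None when
    there are fewer than two releases and nothing needs sorting.
    Raise ValueError if the releases disagree."""
    if len(algorithm_by_release) < 2:
        return None  # a single release never needs sorting
    algorithms = set(algorithm_by_release.values())
    if len(algorithms) > 1:
        # Abort: the dependency has to be specified manually.
        raise ValueError(
            "releases disagree on sort algorithm: "
            + ", ".join(sorted(algorithms))
        )
    return algorithms.pop()
```
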
>
> Now, with either solution, we're looking at more filesystem overhead.
> I'm leaning towards this algorithm:
>
> If project-specific efsdeploy rules exist, then use that sort algorithm
> If not, use the current algorithm (try version.pm, fall back on S::V)
>
> I think the IO overhead, as long as we memoize the values for each
> project, should prove minimal for EPD.
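
A rough Python sketch of that lookup order with memoization.  It is
not EPD's code: PROJECT_RULES stands in for whatever the
project-specific efsdeploy rules provide, and perl_key and gnu_key are
simplified placeholders for version.pm and Sort::Versions, not
reimplementations of either.

```python
from functools import lru_cache

# Hypothetical stand-in for the project-specific efsdeploy rules.
PROJECT_RULES = {"HTML-Template": "gnu"}
lookups = []  # records each real (non-cached) lookup, for illustration


@lru_cache(maxsize=None)
def sort_algorithm_for(project):
    """Look up a project's sort keyword once; later calls hit the cache."""
    lookups.append(project)
    return PROJECT_RULES.get(project)  # None means "no project rule"


def perl_key(release):
    # Placeholder for a version-object parse; fails on e.g. "1.2.3".
    return float(release)


def gnu_key(release):
    # Placeholder for a major.minor.subminor comparison.
    return tuple(int(part) for part in release.split("."))


def sort_releases(project, releases):
    algorithm = sort_algorithm_for(project)
    if algorithm == "gnu":
        return sorted(releases, key=gnu_key)
    if algorithm == "perl":
        return sorted(releases, key=perl_key)
    if algorithm == "text":
        return sorted(releases)
    # No project rule: try the perl-style parse, fall back on gnu-style.
    try:
        return sorted(releases, key=perl_key)
    except ValueError:
        return sorted(releases, key=gnu_key)
```

Sorting the same project's releases repeatedly triggers only one
lookup, which is the point of memoizing per project.
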
>
> If anyone's watching, you'll see some commits over the next few days
> that implement all of this.
_______________________________________________
EFS-dev mailing list
[email protected]
http://mailman.openefs.org/mailman/listinfo/efs-dev
