Eric and I have had an intense off-list discussion over some design issues for inc/ bundling. I suspect a number of our differences of opinion may stem from different starting points or assumptions about the design, so I'd like to step back, be structured in my thinking about the issues and seek the input of the broader list. I think it's important that we have a strong consensus on desired *behaviors* and that we identify and address edge cases. Once we have that, I think reaching consensus on the implementation will be much easier.

Context on inc::latest for those not familiar with it:

inc::latest is a module that would exist as 'inc/latest.pm' in a distribution directory. It will be used to load the "latest" version of a module, whether bundled in inc/ or installed in the ordinary perl library path. E.g. "use inc::latest 'Module::Build'" will load M::B from inc/ if it doesn't exist on the user's system or if the version in inc/ is newer than what is on the user's system. This allows a distribution to rely on features of newer M::B's even if a user doesn't have one installed and doesn't have a CPAN(PLUS) that understands configure_requires.

inc::latest works by conditionally adding a specially-named directory in inc/, e.g. inc/inc_Foo-Bar, to @INC, but only if inc_Foo-Bar contains a newer version of the requested module than is found in the initial @INC. It does *not* just add 'inc' to @INC.

Clarifying some terms and assumptions for the descriptions below:

* I'm going to assume a "bundle_inc" action that is responsible for the act of bundling module files into special directories in inc/. (It would get called during the "distdir" action, most likely.)

* When I refer to $source, I mean the top-level source directory -- e.g. the one checked out from a repo.

* When I refer to $distdir, I mean the Foo-Bar-1.23 directory created under $source when the "distdir" action runs.

* When I refer to "inc/inc_$foo" or "inc/inc_*", I mean one or all of the specially-named directories in inc/ used by inc::latest.

Design issues for discussion:

(1a) Should bundling modules into inc/ happen in $source or $distdir?

Currently, most of the M::B-created content (e.g. README, LICENSE, META.yml, and Makefile.PL) is generated during the "distmeta" action; these files are created in $source and added to the MANIFEST if needed. Only afterwards are the files in MANIFEST copied to $distdir. The exception is SIGNATURE, which is generated within $distdir (and the MANIFEST in $distdir is updated to match).

Bundling into inc/ in $source would be consistent with the approach for other generated files. Bundling into inc/ in $distdir would better isolate modules bundled for end-users from code that is executed by authors.

(1b) For either choice in #1a, how should the behavior of "bundle_inc" change depending on whether $source/inc exists or is missing?

If bundling into inc/ happens in $source, then a missing $source/inc would result in it being created. But should an existing $source/inc be completely removed and then regenerated? Or should only the bundled files within $source/inc be removed and regenerated?

If bundling into inc/ happens in $distdir, then a missing $source/inc has no effect. But if $source/inc exists, is that an error? Should it be copied to $distdir, with bundling then happening into $distdir/inc? Should bundling be done first, with $source/inc files then copied into $distdir/inc?

Put differently, can a user have a $source/inc (e.g. for a custom M::B subclass) and still have bundling? If so, how do we make sure they play nicely together?

(2a) Should inc::latest work for any module at any depth in a directory bundled in inc/? (E.g. use inc::latest 'File::Spec::Functions') Or should inc::latest only be allowed for a "top-level" module? (E.g. use inc::latest 'File::Spec')

If it works for any module at whatever depth, then inc::latest must search *all* directory bundles in inc/ for one containing the requested module. If the module is found in inc/inc_* and has a higher $VERSION than in @INC, the bundled inc/inc_$foo is unshifted onto @INC and the module is loaded. (Note: when this happens, *all* modules in inc/inc_$foo will subsequently take precedence over system-installed modules. E.g. File::Spec::Functions loads File::Spec, but as long as they are in the same bundle, the correct 'latest' version will be loaded.)
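
To make the "any depth" search concrete, here is a minimal sketch of how it could work. This is not the actual inc::latest code: find_newer_bundle() and its two helpers are hypothetical names, and using Module::Metadata to read $VERSION without loading the file is my assumption. It also assumes that a bundle holding an older copy is simply skipped and the search continues, which is one of the behaviors we would need to agree on.

```perl
use strict;
use warnings;
use File::Spec;
use Module::Metadata;
use version;

# Hypothetical helper: the $VERSION declared in a .pm file, or version 0
# if none can be determined. The file is parsed, not loaded.
sub _file_version {
    my ($path) = @_;
    my $meta = Module::Metadata->new_from_file($path)
        or return version->parse(0);
    my $v = $meta->version;
    return defined $v ? version->parse("$v") : version->parse(0);
}

# Hypothetical helper: first copy of $rel_path found in the given directories.
sub _find_in {
    my ($rel_path, @dirs) = @_;
    for my $dir (grep { !ref } @dirs) {    # skip code refs and other @INC hooks
        my $candidate = File::Spec->catfile($dir, $rel_path);
        return $candidate if -f $candidate;
    }
    return;
}

# Return the inc/inc_* directory that should be unshifted onto @INC for
# $module, or nothing if the installed copy is at least as new.
sub find_newer_bundle {
    my ($module, $inc_root, @search_path) = @_;
    (my $rel_path = "$module.pm") =~ s{::}{/}g;

    my $installed         = _find_in($rel_path, @search_path);
    my $installed_version = $installed ? _file_version($installed) : undef;

    opendir my $dh, $inc_root or return;
    my @bundles = map { File::Spec->catdir($inc_root, $_) }
                  sort grep { /^inc_/ } readdir $dh;
    closedir $dh;

    for my $bundle (grep { -d } @bundles) {
        my $bundled = File::Spec->catfile($bundle, $rel_path);
        next unless -f $bundled;
        # Use the bundle if the module is not installed at all, or is older.
        return $bundle
            if !defined $installed_version
            || _file_version($bundled) > $installed_version;
    }
    return;
}

# Possible use inside an import() routine (not executed here):
#   if (my $bundle = find_newer_bundle('File::Spec::Functions', 'inc', @INC)) {
#       unshift @INC, $bundle;
#   }
#   require File::Spec::Functions;
```

The top-level-only variant would replace the loop over every inc/inc_* directory with a single check of the one directory named after the requested module.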

If it works only for a top-level module, then inc::latest checks whether inc/inc_$foo exists and whether the module there has a higher $VERSION than the one in @INC. If so, inc/inc_$foo is unshifted onto @INC and the module is loaded. (Again, all modules in inc/inc_$foo will now take precedence, not only the one initially loaded.) This means that an author may have to load a top-level module with inc::latest just to set up @INC and then use a module that is deeper in the tree. E.g. "use inc::latest 'File::Spec'; use File::Spec::Functions"

(2b) Should the modules to bundle be based on configure_requires, on the modules actually loaded by inc::latest, or on some other specification? How does this depend on the answer to #2a?

If inc::latest only works for a "top-level" module, then "use inc::latest 'Foo::Bar'" is really asking for the latest "distribution" (Foo-Bar) to take precedence in @INC and for the eponymous module (Foo::Bar) to be loaded. The Foo-Bar distribution is what should be bundled, and the list of modules loaded by inc::latest can be used for bundling (and for populating configure_requires). See #3 for issues of mechanics around bundling sets of modules.

If inc::latest works for a module of any depth, then the requested module does not necessarily map directly to a distribution, and thus it's more difficult to determine the set of modules that should be bundled from the list loaded by inc::latest. In this case, configure_requires could be used to indicate the 'distribution' equivalents to bundle, even if inc::latest loads a deeper module. E.g. configure_requires could have File::Spec even though inc::latest is loading File::Spec::Functions. (This becomes relevant for M::B if someone wants to use inc::latest 'Module::Build::Functions' instead of Module::Build.)

(3) How should we map from a module name to bundle to a set of module files?

Whether derived from configure_requires or from the modules loaded by inc::latest, we have a list of module names but need a set of related module files that must be copied to the right place in inc/.

The purest solution would use a CPAN index to identify the distribution that provides the given module name. Then that distribution could be bundled in a variety of ways. The most complex, yet most robust, option is to use a CPAN client to install it into inc/inc_$foo with some install_base customization.

An alternative is to insist that modules to bundle be installed from CPAN (not just core) so that a packlist file can be used to identify the modules to copy. This is a reasonable solution, except that identifying the packlist location is not always trivial. E.g. File::Spec is contained within the PathTools distribution, but the packlist is installed under the name Cwd.

A pragmatic solution would be to recursively copy all files under the top-level module name. E.g. for File::Spec, copy File/Spec.pm and File/Spec/** into inc/inc_File-Spec. This avoids needing to rely on CPAN indices or clients, but risks including unrelated modules under the same namespace or excluding related dependencies in different namespaces that are normally installed together. E.g. all the different package names in the libwww-perl distribution: File*, HTML*, LWP*, Net* and WWW*.

(4) Should inc::latest emulate "require" or "use"? If "use", should it be able to handle the full semantics of "use MODULE VERSION LIST", and how?

Right now, inc::latest require()'s the module and calls import() with zero or more arguments. It does not support *not* calling import() and does not support VERSION checking. Are these important? Being able to skip the default imports could be important. Should that just be an explicit undef? "use inc::latest 'Foo::Bar' => undef"

That's all from me for now. For anyone who read this far, thank you. Your thoughts and reactions are greatly appreciated.

-- David