Module Library - aka CPAN

Richard Hainsworth Sun, 31 May 2009 03:24:17 -0700

Before joining in the previous CPAN threads, here is a personal wish list for what the perl6 version of CPAN should do. In order to get some distance from CPAN, I want to call it the Module Library system. Some of the debate threads touch on the internal software environment, and here I think the internal and external environments must be designed in harmony.

A library paradigm is different from an archive paradigm, although a library is a form of archive. A library is a single place to go to obtain information. It is set up to enhance searching and finding. An archive is a place where things are stored; the users of an archive know what and where they are storing. Hence the difference between an archive and a library is a set of assumptions about what a user knows. Archives are essential for large libraries (often called stacks), but there must be standardised metadata about what is in each archive for it to be useful for a library.


So my wish list. I would like:

1) a single place to go for modules that will help me solve programming problems;

2) a variety of ways to look through modules and the information on them - reviews, documentation, dependencies - so that I can choose between them and know what a choice entails;

3) a variety of classifications of modules and module contents that can lead me to modules which may achieve what I want but are named in a manner I did not expect, or contain functionality I could use in another context;

4) a means to extract from the library the module(s) I want (better still, only the parts of the module) and all their dependencies;

5) the ability to access existing CPAN perl software.

Extending the library paradigm, there are generalist libraries, which only contain commonly requested books, and specialist libraries, and also libraries in other languages. However, good libraries share systems for requesting books from other libraries, even if they are not stored locally. The difference between libraries then boils down to the metadata available to the user, eg. more accessible user catalogs, librarians who know about subjects and can point to related books, etc.

The library paradigm breaks down somewhat for software because there is an intimate link between "my" environment and the software I choose. Hence, it is important to consider the internal environment ("my" computer) as well as the external environment (the module libraries).

Not only is there the problem of different operating systems (*ix, windows*, mac, etc), there is also the possibility of outdated software being installed, eg. different versions of languages, such as perl 5.8 and perl 5.10.

Within "my" environment, there are multiple possible locations for different aspects of software, eg. binaries, configuration data (which can be global to all users and local for a single user), documentation, data, results. In a corporate environment, these locations may exist across a network. For example, documentation may be on a server, binaries and local configuration data on the local computer, results posted to a web site, input data taken from a database.


My wish list for the internal environment; I would like:

1) a single system to manage all my software modules;

2) a method to see where different aspects of software are located, and to change them;

3) an ability to manage in parallel different dependency hierarchies, such as might result from a need to use modules from different "generations" (eg. perl 5.8 and 5.10);

4) a comprehensive method to determine what my local environment is, so that I can see whether it "matches" the environment required by software that I want to download, and if not, what I need in addition to be able to run the software.


Some commentary on the above:

I use the CPAN archive, CPANPLUS, and synaptic (a GUI frontend to the apt tools) for perl since I use Ubuntu/Gnome. In some cases I have to load binaries using synaptic because I can't get the sources to compile properly from CPAN. I do not have the time or inclination to discover why. The result is a mishmash of dependencies. Moreover, I cannot get updated software without breaking something. This occurred when one Ubuntu generation was distributed with perl 5.8 even though perl 5.10 was out, and I really wanted to use some of the new software.

The result was that I had both 5.8 and 5.10 modules. Although most things worked, perldoc wasn't recognised by some Configure scripts. I am not interested now in discovering the reason. I use this simply as an example of how multiple distribution channels that make different assumptions can lead to problems. Because there is no single method for monitoring the perl environment, it is difficult to solve some of the problems.

The current setup works extremely well, except when it doesn't. My difficulty is finding out how to solve the problems that occur. So I think there should be a system that normally works "out-of-the-box" with no user interference, but with tools that can be used to identify and change a configuration when needed.


The source vs binary debate:

I would suggest that most users would prefer to have precompiled binaries that work out of the box for common environments. The problem of matching the software to an environment is a bit like getting a spare part for a car, or a replacement for something broken at home. The possible infinity of choices given multiple manufacturers and systems is solved by having standards and selling parts that will match standards, eg. tyres or bolts with standard diameters.

Once a module has been decided on, you look to see if there is a binary that matches your internal environment. If not, you have to roll your own from source.
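The lookup just described can be sketched roughly as follows. This is only a minimal Python illustration under stated assumptions: the fields used to describe an environment (operating system, architecture, perl version), the `Environment` and `BinaryPackage` records, and the `choose_distribution` function are all hypothetical, and a real library would need a far richer notion of "matches".

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Environment:
    # Hypothetical description of an environment; real standards would need more fields.
    os: str
    arch: str
    perl: str

@dataclass(frozen=True)
class BinaryPackage:
    # Hypothetical catalogue entry: a binary built for one environment.
    module: str
    env: Environment

def choose_distribution(module, local, binaries):
    """Return ("binary", package) if a binary was built for exactly the
    local environment, otherwise ("source", None)."""
    for pkg in binaries:
        if pkg.module == module and pkg.env == local:
            return ("binary", pkg)
    return ("source", None)

# Example data, invented purely for illustration.
local = Environment("linux", "x86_64", "5.10")
catalogue = [
    BinaryPackage("MyModule", Environment("windows", "x86", "5.10")),
    BinaryPackage("MyModule", Environment("linux", "x86_64", "5.10")),
]

kind, pkg = choose_distribution("MyModule", local, catalogue)
print(kind)
```

Exact equality is the simplest possible matching rule; the standards set by the library administration would decide what a looser, but still safe, match looks like.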

It would be a function of the library administration to set rules as to the standards that a binary needs to meet to be filed in some category.


Photographs vs software:

The more flexible the design, the longer lasting. Libraries can contain books on religion and pornographic magazines. The design of a library search or storage system does not define what can be put in it. There is a place for strong views about what should be in a library (eg. no pictures, only pure perl6). In terms of library design, the question should be "how can the user discover, in general, what a library contains". If you don't like the contents of a library, use another.

I believe it would be useful to have multiple libraries, each of which has a "mission statement" that would define what sort of information it contains. Hence there could be libraries dedicated to windows-only binaries, to perl6 modules, or to perl6/Ruby/Python interacting software.

If the system proves to be flexible, then the same library design and management software could be applied to photographs, food recipes, etc. In the same way, the worldwide web was designed for scientific papers with embedded links, but the design is so flexible it has changed the structure of our informational world.


Names and implicit assumptions:

There was a thread about CPAN6, 6PAN, Pause6, etc. It seems to me that the conflicting ideas underlying these names concern the user paradigm, that is, how someone uses the entire system. I believe that we might get more traction if we start by looking first at the user paradigm, then at different ways to provide the functionality.


I know that many of my wishes are fulfilled by the more detailed proposals.

However, the proposals seem to assume what the whole system should do. The idea behind this post is to make the user paradigm clearer, so that the strengths and weaknesses of different approaches can be compared.


One way of breaking out parts of the whole system might be:
a) Internal module management
(for example)

- provides to a perl program the physical locations for settings in %*ENV
eg. 'use MyModule::Submodule;'
might normally lead to (in unix)
'/usr/local/share/lib/perl6/site/MyModule/Submodule.pm'
but it could also be
'/home/project_10/version_15/assistant_junior_hacker/Submodule.test'
or 'http://www.project_10.design.library/MyModule/Submodule.pm'
or 'select "MyModule_SubModule" from "standard_modules" where language="perl6"'

- allows the user to set an arbitrary environment variable, eg. %*ENV<April_Dataset>, so that inside the program it is possible to do

my $data = slurp(%*ENV<April_Dataset>);

without worrying where the binary is running, eg. in PADRE, or called from the directory where the file is present. Also, the program could be agnostic about whether it is being run on Windows, where the path definition is 'MyDocuments\UserData\April_Dataset.csv', or on unix, where it is '/home/hacker/perl6/April-09.DataSet.current'

- defines how software should be distributed in the system, eg. binaries in /usr/local/lib/perl6/site, documentation in /usr/local/doc/perl6/site, config files in /usr/local/etc/perl6/site, source files in /usr/local/src/perl6/site


- handles downloads from archives

- knows how to build binaries from src and install and test them.

- knows how to define the local environment taking into account modules that may be at a physical location not on the local computer.
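As an illustration of the first point in this list, here is a minimal Python sketch of how 'use MyModule::Submodule;' might be mapped to a physical location. It rests on assumptions: the function names, the idea of a per-module override table, and the use of the unix path from the example as the default root are all hypothetical, and nothing here reflects how perl6 actually locates modules.

```python
# Hypothetical default root, taken from the unix example in this post.
DEFAULT_ROOT = "/usr/local/share/lib/perl6/site"

def module_to_relative_path(name):
    """Turn 'MyModule::Submodule' into 'MyModule/Submodule.pm'."""
    return "/".join(name.split("::")) + ".pm"

def resolve_module(name, overrides=None, root=DEFAULT_ROOT):
    """Return the location a module name resolves to.

    `overrides` is a hypothetical user-managed table that can point a
    module at any location: a project directory, a URL, and so on.
    """
    overrides = overrides or {}
    if name in overrides:
        return overrides[name]
    return root + "/" + module_to_relative_path(name)

# Normal case: the conventional site location.
print(resolve_module("MyModule::Submodule"))

# The same name redirected by the user to a remote location.
redirects = {
    "MyModule::Submodule":
        "http://www.project_10.design.library/MyModule/Submodule.pm",
}
print(resolve_module("MyModule::Submodule", redirects))
```

The point of the override table is that the program text stays the same while the internal module manager decides where the module really comes from.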


b) Library sites

- presents information about modules, reviews, documentation, similar categories,
- exposes documentation information to "grok" about individual functions in modules

- mission statement about the library
- accesses metadata from different archives
- generates the dependency information for a selected module.

- accepts user feedback about software modules, so that incomplete dependencies can be noted etc
- is able to extract information from existing CPAN archives and wrap/preprocess CPAN packages for perl6
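Generating the dependency information for a selected module, as listed above, amounts to walking the metadata transitively. The following is a minimal Python sketch under assumptions: the metadata is taken to be a plain mapping from module name to its direct dependencies, and the function name and the module names in the example are invented for illustration.

```python
def dependency_closure(module, metadata):
    """Return every direct and indirect dependency of `module`,
    ordered so that each module appears after its own dependencies.

    `metadata` maps a module name to a list of its direct dependencies.
    A module missing from `metadata` is treated as having none.
    """
    ordered = []
    seen = set()

    def visit(name, stack):
        if name in stack:
            raise ValueError("cyclic dependency involving " + name)
        if name in seen:
            return
        for dep in metadata.get(name, []):
            visit(dep, stack | {name})
        seen.add(name)
        ordered.append(name)

    visit(module, frozenset())
    return ordered[:-1]  # drop the selected module itself

# Invented example metadata.
metadata = {
    "App": ["Parser", "Logger"],
    "Parser": ["Logger"],
    "Logger": [],
}
print(dependency_closure("App", metadata))
```

The returned order is also a workable install order, since nothing is listed before the modules it depends on.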


c) Archive
Standards on:
- source, dependencies, binaries, documentation, reviews, classifications
- file storage

- standard definitions for environments in which a binary should be guaranteed to run (if stated dependencies are also available)


Hope this helps.
