Re: CPANDB - was: Module::Dependency 1.84

Adam Kennedy Wed, 12 Jul 2006 23:29:34 -0700

Tels wrote:


Moin,

On Wednesday 12 July 2006 03:13, David Golden wrote:

Tels wrote:

My idea was to build _only_ the database, and do it right, simple and
easy to use and then get everyone else to just use the DB instead of
fiddling with their own. (simple by having the database being superior
to every other hack that's in existence now :-)

I even got so far as to do a mockup v0.02 - but then went back to
playing Guildwars.

Is this a project that would be of general interest?

At YAPC::NA, Adam Kennedy mentioned that he wanted to try to make some
headway on CPAN::Index, which sounds very similar in intent.  While it's
not released, you can see the formative project at his public repository:

http://tinyurl.com/g888h

Perhaps you can join forces with him and help push some collective
project towards a release.


Quoted:

B<CPAN::Index> provides object-oriented access to the CPAN index,
using a collection of relatively common modules, and automates the
entire process of fetching and accessing the index.

Uhm, no, the DB should maybe be able to have a front-end that fills it from the CPAN index, but you should also be able to build a DB from a local file, if you so wish.


Well of course, and that's what it does.

It has a database structure with a DBIx::Class layer to the schema for a CPAN index, and it handily comes by default with modules to download the main CPAN index and stream it into the database.

If we wanted to add a different mechanism to populate the same database, that would be just fine. In fact, part of what I want is to create an equivalent of CPAN::Inject, to be able to build out a custom "modified CPAN" that also includes commercial modules.

The index is stored in a L<DBD::SQLite> database file, with an object
model implemented around it using L<DBIx::Class>. To update the index,
the L<CPAN::Index::Loader> class implements the logic to flush and reset
the database, fetch the index files, parse them, and repopulate the
database.

The DB backend shouldn't matter at all, it should be transparent and be switchable without any noticeable change at the front.


Yep, right with you. Hence DBIx::Class.
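The "flush, parse, repopulate" cycle described above, fed from a local file rather than a download, can be sketched roughly as follows. This is an illustration only, written in Python with the standard sqlite3 module rather than Perl and DBIx::Class, and it is not the CPAN::Index code. The table layout, the function names and the Example::* records are all assumptions made up for the sketch; only the general layout of 02packages.details.txt (a header block, a blank line, then "package version path" records) is taken from CPAN.

```python
import sqlite3

# Made-up records in the layout of CPAN's 02packages.details.txt:
# a header block, one blank line, then "package  version  path" lines.
SAMPLE_INDEX = """\
File:         02packages.details.txt
Description:  Package names found in directory $CPAN/authors/id/

Example::Alpha   1.00   E/EX/EXAMPLE/Example-Alpha-1.00.tar.gz
Example::Beta    undef  E/EX/EXAMPLE/Example-Beta-0.02.tar.gz
"""


def parse_packages(lines):
    """Yield (package, version, dist_path) one record at a time."""
    in_body = False
    for line in lines:
        if not in_body:
            # The header block ends at the first blank line.
            if line.strip() == "":
                in_body = True
            continue
        fields = line.split()
        if len(fields) != 3:
            continue  # skip anything that is not a three-column record
        package, version, path = fields
        # The index writes "undef" for packages without a version.
        yield package, (None if version == "undef" else version), path


def reload_index(conn, lines):
    """Flush the (hypothetical) package table and repopulate it."""
    with conn:  # single transaction: commit on success, roll back on error
        conn.execute(
            "CREATE TABLE IF NOT EXISTS package ("
            " name TEXT PRIMARY KEY, version TEXT, dist_path TEXT NOT NULL)"
        )
        conn.execute("DELETE FROM package")
        # The generator is consumed row by row, so the whole index never
        # has to sit in memory as one big structure.
        conn.executemany(
            "INSERT INTO package (name, version, dist_path) VALUES (?, ?, ?)",
            parse_packages(lines),
        )


conn = sqlite3.connect(":memory:")
reload_index(conn, SAMPLE_INDEX.splitlines())
```

Because reload_index() accepts any iterable of lines, the same database could be filled from an open file handle, a downloaded index, or a hand-written list of private modules without changing the schema.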

Plus, I planned to use YAML because it creates a _much_ lighter overhead and dependency chain. Using SQLite or similar is what really creates the problems with CPANTS - you can't just access the raw database without the front-end.

Erm, I'm not sure I get you here. The main problem with all the things that load the index currently is that they tend to consume 50-100 meg of RAM, and that's just the active index. Add in extra stuff and you're up into stupid amounts of RAM quite quickly.

With YAML, at least you could get at the data by other means. Of course for performance reasons it might not be good, but since premature optimization is the root of all evil, I'd say use YAML now, change when necessary.

It's already necessary, based on the memory load. That's not to say you have to use SQLite to DISTRIBUTE or publish the data, just that when you access and manipulate it computationally, you do it via SQLite.

This method has worked spectacularly well in the JSAN code, which implements something very similar to what I'm heading towards. To do dependencies and so on, it just links an Algorithm::Dependency object into the DBIx::Class code, and there you have your full range of graph-math things, like dep chains (up and downstream), weights, etc.
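To make "dep chains (up and downstream)" concrete, here is a small sketch of walking a dependency table inside SQLite itself. It does not use Algorithm::Dependency or DBIx::Class; it swaps in a recursive SQL query driven from Python's sqlite3 module. The dependency table, the chain() helper and the sample distribution names are hypothetical, and "upstream" is assumed here to mean "everything a distribution needs", with "downstream" meaning "everything that needs it".

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical edge table: one row per "dist requires requires".
conn.execute(
    "CREATE TABLE dependency (dist TEXT NOT NULL, requires TEXT NOT NULL)"
)
conn.executemany(
    "INSERT INTO dependency (dist, requires) VALUES (?, ?)",
    [
        ("App", "Web"),
        ("App", "Log"),
        ("Web", "Http"),
        ("Http", "Socket"),
        ("Log", "Socket"),
    ],
)


def chain(conn, start, direction="upstream"):
    """Return the sorted transitive dependency chain of `start`.

    upstream   -> everything `start` needs, directly or indirectly
    downstream -> everything that needs `start`, directly or indirectly
    """
    if direction == "upstream":
        src, dst = "dist", "requires"
    elif direction == "downstream":
        src, dst = "requires", "dist"
    else:
        raise ValueError("direction must be 'upstream' or 'downstream'")
    # src/dst come from the fixed pairs above, never from user input,
    # so interpolating them into the SQL text is safe here.
    sql = f"""
        WITH RECURSIVE reach(name) AS (
            SELECT {dst} FROM dependency WHERE {src} = ?
            UNION  -- UNION drops duplicates, so cycles still terminate
            SELECT d.{dst} FROM dependency AS d
            JOIN reach AS r ON d.{src} = r.name
        )
        SELECT name FROM reach ORDER BY name
    """
    return [row[0] for row in conn.execute(sql, (start,))]


needs = chain(conn, "App", "upstream")
needed_by = chain(conn, "Socket", "downstream")
```

One simple notion of "weight" in this picture would be len(chain(conn, name, "downstream")), the number of distributions that would be affected if name broke. The walk happens inside the database, so the graph is not loaded into RAM as a whole.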

Otherwise, the projects seem similar in scope, except that I focus on the DB and let things like "download stuff" be done outside.
Whether that works out, uh, I don't know. In any event I have quite a few ideas I'd like to try out and this proves to be fun to me. I hope it doesn't end up that I have to implement other people's ideas - that's too much work, like :D

Have a look at just the database parts, which I've finished already from memory.

The bit that isn't finished is the complete "grab your CPAN.pm config and stream down all the files and populate the database".


Adam K
