Tels wrote:

Moin,

On Wednesday 12 July 2006 03:13, David Golden wrote:
Tels wrote:
My idea was to build _only_ the database, and do it right, simple and
easy to use and then get everyone else to just use the DB instead of
fiddling with their own. (simple by having the database being superior
to every other hack that's in existence now :-)

I even got so far as to do a mockup v0.02 - but then went back to
playing Guildwars.

Is this a project that would be of general interest?
At YAPC::NA, Adam Kennedy mentioned that he wanted to try to make some
headway on CPAN::Index, which sounds very similar in intent.  While it's
not released, you can see the formative project at his public repository:

http://tinyurl.com/g888h

Perhaps you can join forces with him and help push some collective
project towards a release.

Quoted:

B<CPAN::Index> provides object-oriented access to the CPAN index,
using a collection of relatively common modules, and automates
the entire process of fetching and accessing the index.

Uhm, no the DB should maybe be able to have a front-end that fills it from the CPAN index, but you should also be able to build a DB from a local file, if you so wish.

Well of course, and that's what it does.

It has a database structure with a DBIx::Class layer to the schema for a CPAN index, and it handily comes by default with modules to download the main CPAN index and stream it into the database.

If we wanted to add a different mechanism to populate the same database, that would be just fine. In fact, part of what I want is to create an equivalent of CPAN::Inject, to be able to build out a custom "modified CPAN" that also includes commercial modules.

The index is stored in a L<DBD::SQLite> database file, with an object
model implemented around it using L<DBIx::Class>. To update the index,
the L<CPAN::Index::Loader> class implements the logic to flush and reset
the database, fetch the index files, parse them, and repopulate the
database.
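
As a rough illustration of the flush / parse / repopulate cycle described above, here is a minimal sketch in Python using only the standard sqlite3 module. It is not the CPAN::Index::Loader code: the table name `package`, its columns, and the function names are hypothetical, and the fetch step is left out (the caller supplies the lines). It only assumes the 02packages.details.txt layout of a header block, a blank line, then "package version path" rows.

```python
import sqlite3
from typing import Iterable, Iterator, Tuple


def parse_packages(lines: Iterable[str]) -> Iterator[Tuple[str, str, str]]:
    """Yield (package, version, path) rows one at a time.

    Assumes the 02packages.details.txt layout: a header block,
    a blank line, then whitespace-separated three-column rows.
    """
    in_body = False
    for line in lines:
        if not in_body:
            # The header block ends at the first blank line.
            if not line.strip():
                in_body = True
            continue
        parts = line.split()
        if len(parts) == 3:
            yield parts[0], parts[1], parts[2]


def reload_index(conn: sqlite3.Connection, lines: Iterable[str]) -> int:
    """Flush the (hypothetical) package table and repopulate it.

    Rows are streamed from the parser into SQLite, so the whole
    index never has to be held in memory as Python objects.
    """
    with conn:  # commit on success, roll back on error
        conn.execute(
            "CREATE TABLE IF NOT EXISTS package ("
            " name TEXT PRIMARY KEY, version TEXT, path TEXT)"
        )
        conn.execute("DELETE FROM package")  # flush and reset
        conn.executemany(
            "INSERT INTO package (name, version, path) VALUES (?, ?, ?)",
            parse_packages(lines),
        )
    return conn.execute("SELECT COUNT(*) FROM package").fetchone()[0]
```

Because `reload_index` takes any iterable of lines, the same table could be filled from a downloaded index or from an open local file handle, which is the point both sides of the thread seem to agree on.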

The DB backend shouldn't matter at all, it should be transparent and be switchable without any noticeable change at the front.

Yep, right with you. Hence DBIx::Class.

Plus, I planned to use YAML because it creates a _much_ less heavy overhead and dependency chain. Using SQLite or similar is what really creates the problems with CPANTS - you can't just access the raw database without the front-end.

Erm, I'm not sure I get you here. The main problem with all the things that load the index currently is that they tend to consume 50-100 meg of RAM, and that's just the active index. Add in extra stuff and you're up into stupid amounts of RAM quite quickly.

With YAML, at least you could get at the data by other means. Of course for performance reasons it might not be good, but since premature optimization is the root of all evil, I'd say use YAML now, change when necessary.

It's already necessary, based on the memory load. That's not to say you have to use SQLite to DISTRIBUTE or publish the data, just that when you access and manipulate it computationally, you do it via SQLite.

This method has worked spectacularly well in the JSAN code, which implements something very similar to what I'm heading towards. To do dependencies and so on, it just links an Algorithm::Dependency object into the DBIx::Class code, and there you have your full range of graph-math things, like dep chains (up and downstream), weights, etc.
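
To make the "dep chains in both directions" idea concrete, here is a small Python sketch over an SQLite table. It does not use or imitate the Algorithm::Dependency API; it is a plain graph walk, and the `dependency` table, its `dist` and `requires` columns, and the function name are all hypothetical.

```python
import sqlite3
from typing import List


def dep_chain(conn: sqlite3.Connection, start: str,
              direction: str = "requires") -> List[str]:
    """Return the transitive dependency chain for one distribution.

    direction="requires"    -> everything `start` needs, directly or not
    direction="required_by" -> everything that needs `start`
    Assumes a hypothetical table dependency(dist, requires).
    """
    if direction == "requires":
        sql = "SELECT requires FROM dependency WHERE dist = ?"
    elif direction == "required_by":
        sql = "SELECT dist FROM dependency WHERE requires = ?"
    else:
        raise ValueError("direction must be 'requires' or 'required_by'")

    seen = set()
    pending = [start]
    while pending:
        node = pending.pop()
        # Fetch this node's neighbours before moving on to the next node.
        for (neighbour,) in conn.execute(sql, (node,)).fetchall():
            if neighbour != start and neighbour not in seen:
                seen.add(neighbour)
                pending.append(neighbour)
    return sorted(seen)
```

Only one node's neighbours are fetched at a time, so the dependency graph itself stays in the database; the `seen` set guards against circular dependencies.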

Otherwise, the projects seem similar in scope, except that I focus on the DB and let things like "download stuff" be done outside.

Whether that works out, uh, I don't know. In any event I have quite a few ideas I'd like to try out and this proves to be fun to me. I hope it doesn't end up that I have to implement other people's ideas - that's too much like work :D

Have a look at just the database parts, which, from memory, I've already finished.

The bit that isn't finished is the complete "grab your CPAN.pm config and stream down all the files and populate the database".

Adam K
