Plus, I planned to
use YAML because it creates a _much_ less heavy overhead and dependency
chain. Using SQLite or similar is what really creates the problems
with CPANTS - you can't just access the raw database without the
front-end.
Erm, I'm not sure I get you here. The main problem with all the things
that load the index currently is that they tend to consume 50-100 meg
of ram, and that's just the active index. Add in extra stuff and you're
up into stupid amounts of ram quite quickly.

I am not sure what "stupid" consists of, but my system wouldn't have problems handling 512 MB of memory.

OTOH, if the raw data in memory is like 100 Megabyte, then where does all the data come from? The CPAN index is surely not 100 Megabyte, right?

Raw data inflates into memory, once you factor in Perl's overheads. CPAN.pm uses about 50meg to do it, CPANPLUS used to use about 80meg, although kane has pulled that back a bit now I believe.

And while doing this temporarily for the purposes of installing a module might be fine, I have use cases in mind like running a CPAN index inside a Perl editor.

And it would most certainly be "stupid" if your editor uses 250 meg of RAM.

It's already necessary, based on the memory load. That's not to say you
have to use SQLite to DISTRIBUTE or publish the data, just that when you
access and manipulate it computationally, you do it via SQLite.

Well, I guess you have more experience on that than I have. I would have tried without it first.
This method has worked spectacularly well in the JSAN code, which

What is JSAN?

implements something very similar to what I'm heading towards.

Er, so why aren't you using what JSAN is? *confused*

JSAN is the JavaScript Archive Network, the port of CPAN for JavaScript.

The implementation was a nice chance to blow away all the CPAN cruft and try some new ideas out, without having to care about back-compatibility.

Although I'm reusing the same techniques (SQLite + ORM layer) for CPAN::Index, I can't reuse the code directly, because there are different language concepts involved, so the XPAN client for each needs to work differently as a result.
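
To make the "access it via SQLite" idea a bit more concrete, here is a rough sketch. It is in Python rather than Perl, purely for illustration, and it is not the CPAN::Index or JSAN code. The table layout, the `build_index` and `lookup` helpers and the sample index lines are all made up for the example. The point is only that the index lives in a database file and each lookup pulls back a single row, instead of the whole index being inflated into in-memory data structures.

```python
import os
import sqlite3
import tempfile

# Hypothetical flat index: module name, version, distribution path.
# These entries are invented for illustration, not real CPAN data.
SAMPLE_INDEX = """\
Example::Alpha 1.02 E/EX/EXAMPLE/Example-Alpha-1.02.tar.gz
Example::Beta 0.10 E/EX/EXAMPLE/Example-Beta-0.10.tar.gz
Example::Gamma 2.00 E/EX/EXAMPLE/Example-Gamma-2.00.tar.gz
"""


def build_index(db_path, lines):
    """Load whitespace-separated index lines into a SQLite table."""
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS module ("
        "name TEXT PRIMARY KEY, version TEXT, dist TEXT)"
    )
    # Skip blank lines; each remaining line splits into three fields.
    rows = (line.split() for line in lines if line.strip())
    with conn:  # commit all inserts as one transaction
        conn.executemany("INSERT OR REPLACE INTO module VALUES (?, ?, ?)", rows)
    return conn


def lookup(conn, name):
    """Return (version, dist) for one module, or None if it is not indexed."""
    cur = conn.execute(
        "SELECT version, dist FROM module WHERE name = ?", (name,)
    )
    return cur.fetchone()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        conn = build_index(os.path.join(tmp, "index.db"), SAMPLE_INDEX.splitlines())
        print(lookup(conn, "Example::Alpha"))
        conn.close()
```

How the data gets published (YAML, flat text, whatever) is a separate question; something like `build_index` would just be the import step on the client side.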

To do dependencies and so on, it just links an Algorithm::Dependency object

Another dependency to find out what the dependency is.

I know you don't like dependencies, but your choices come down to dependency or reimplementation, if you want some feature. And as long as the dependency has 100% platform support and never fails to install, it isn't a problem.

The reason CPAN is in the current situation of "dependency == bad" is primarily (but not exclusively) that many modules fail on various edge cases, and those edge cases have a combined effect. Throw in dependency recursion and, unless you are paranoid about the things you depend on being clean, you run into trouble.

Nobody would care about dependencies if they never failed (except for the issue of installation time).
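
For anyone wondering what the dependency step actually involves, here is a minimal sketch of recursive dependency ordering. Again it is Python for illustration only; it is not Algorithm::Dependency's API, just a generic depth-first walk, and the `install_order` name and the `DEPS` table are invented for the example. It also shows why recursion compounds failures: everything reachable from the target ends up in the list, so one broken item anywhere in it breaks the install.

```python
def install_order(deps, target):
    """Return target plus everything it needs, prerequisites first."""
    order = []
    done = set()
    in_progress = set()

    def visit(name):
        if name in done:
            return
        if name in in_progress:
            raise ValueError(f"circular dependency involving {name}")
        in_progress.add(name)
        # Items with no entry in the table are treated as having no dependencies.
        for dep in deps.get(name, ()):
            visit(dep)
        in_progress.discard(name)
        done.add(name)
        order.append(name)

    visit(target)
    return order


# Hypothetical dependency table: each key depends on the listed items.
DEPS = {
    "App": ["Lib::A", "Lib::B"],
    "Lib::A": ["Lib::C"],
    "Lib::B": ["Lib::C"],
    "Lib::C": [],
}

if __name__ == "__main__":
    print(install_order(DEPS, "App"))
    # prints ['Lib::C', 'Lib::A', 'Lib::B', 'App']
```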

The bit that isn't finished is the complete "grab your CPAN.pm config
and stream down all the files and populate the database".

I see you have done much, much more work and given it much, much more thought,
so I better just shut my mouth, throw my silly little attempt at yet-another-wheel into the dustbin and go away. :-/

Good luck with CPAN::Index.

I'm all for more wheels, I'm just pointing out some issues you might run into. And if yours exists and mine doesn't, well yours is far better than mine.

Adam K
