While I'm waiting for freenetproject.org to come back up so I can update my 
nodes properly and get past all the non-backward-compatible changes, I'd 
like to share some thoughts with you.

The number of freesites is getting unwieldy.  In order to have an index that 
will be usable in the future, we need to think about scalability.  This 
breaks down into the following requirements:

1. It should be completely automated.  No human intervention.  Humans have 
better things to do.

2. It must sort the sites by category.

3. It must be possible to add new categories, in a hierarchy.

4. No unnecessary centralization.

(I thought of adding a requirement that it be highly resistant to spamming 
and DoS, but I have no idea how to make that happen, so never mind.)

Here's my idea:

Start with these categories:

* Meta/incestuous.  Sites that are about Freenet itself in some way.
* Opinion/politics
* Vanity blogs
* News
* Scientology (yes, it probably needs its own top-level category)
* Non-porn entertainment
* Warez
* Porn

For each of these, create a KSK submission queue, using the NIM mechanism.  
Actually, that's just a matter of designating a subset of the KSK namespace.
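
Just to make that concrete, here's a rough Python sketch of how the queue 
keys could be laid out, assuming a NIM-style scheme of one predictable KSK 
name per category, day, and slot number.  The prefix and naming below are 
entirely my own invention, not anything standardized:

    import datetime

    # Hypothetical convention: every category's submission queue lives under
    # one designated KSK prefix, with date/slot components so a client can
    # poll predictable key names.
    QUEUE_PREFIX = "KSK@freesite-index"

    def queue_key(category, date, slot):
        # e.g. KSK@freesite-index/Warez/<YYYY-MM-DD>/0
        return "%s/%s/%s/%d" % (QUEUE_PREFIX, category, date.isoformat(), slot)

    print(queue_key("Warez", datetime.date.today(), 0))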

Entries to such a queue consist of one line containing the SSK to the site.  
What about the description and ActiveLink?  Here's how we handle them:

Each site optionally has an activelink in the mapfile, named activelink.png, 
activelink.jpg or activelink.gif.

Each site optionally has a file named description.txt in the mapfile, 
containing a brief description.
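
Putting those pieces together, a submission and the site it points to would 
look something like this (the key is a placeholder, and I'm just restating 
the convention above as Python data for clarity):

    # A submission is one line in the queue, holding the site's SSK:
    submission = "SSK@<site key>/mysite//"

    # Files the indexing job will look for inside that site's mapfile:
    expected_files = {
        "index.html":      "required; also the source of the title",
        "activelink.png":  "or activelink.jpg / activelink.gif; optional",
        "description.txt": "optional; truncated if too long",
    }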

For every category, there is a cron job or daemon running somewhere to fetch 
the submission queue and maintain the index page as a DBR.  This job runs 
freely available source, so anyone who wants to can set one up.

The daemon reads the SSKs from the queue and stores them in its database.  
It goes through the entries, trying to fetch those that haven't been 
successfully fetched, with priority given to newer entries.  It tries to get 
a title from the index.html, and the activelink and description as described 
above.  It makes an entry in the directory with as many of these as it can 
get, so long as at least the index.html is fetchable.  The description is 
truncated if need be.

Duplicates should be filtered.  A new duplicate entry overrides an
existing one.  This is a good mechanism to update the index with changes
to a site's activelink, title or description.
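
Here's a rough Python sketch of that whole loop, including the "newest 
duplicate wins" rule, just to pin the behaviour down.  fetch_key() is a 
stand-in for whatever FCP client the job actually uses, and every name in 
it is made up:

    import re
    import sqlite3

    def fetch_key(key):
        """Placeholder for a real FCP request: return the data behind `key`
        as bytes, or None if it can't be retrieved.  This stub never
        succeeds; swap in a real client."""
        return None

    def run_once(db, queue_keys, max_description=200):
        cur = db.cursor()
        cur.execute("""CREATE TABLE IF NOT EXISTS sites
                       (ssk TEXT PRIMARY KEY, title TEXT, description TEXT,
                        has_activelink INTEGER, fetched INTEGER)""")

        # 1. Pull submissions off the queue.  A resubmitted SSK replaces the
        #    old row, so duplicate filtering and updates to a site's title,
        #    activelink or description fall out for free.
        for qkey in queue_keys:
            data = fetch_key(qkey)
            if not data:
                continue
            for line in data.decode("utf-8", "replace").splitlines():
                ssk = line.strip()
                if ssk:
                    cur.execute("INSERT OR REPLACE INTO sites"
                                " VALUES (?, NULL, NULL, 0, 0)", (ssk,))

        # 2. Try the sites that haven't been fetched yet, newest first
        #    (rowid is insertion order here).
        cur.execute("SELECT ssk FROM sites WHERE fetched=0 ORDER BY rowid DESC")
        for (ssk,) in cur.fetchall():
            page = fetch_key(ssk + "index.html")
            if not page:
                continue        # only listed once index.html is fetchable
            m = re.search(rb"<title>(.*?)</title>", page, re.I | re.S)
            title = m.group(1).decode("utf-8", "replace").strip() if m else ssk
            desc = fetch_key(ssk + "description.txt")
            desc = desc.decode("utf-8", "replace")[:max_description] if desc else ""
            has_al = any(fetch_key(ssk + "activelink." + ext)
                         for ext in ("png", "jpg", "gif"))
            cur.execute("UPDATE sites SET title=?, description=?,"
                        " has_activelink=?, fetched=1 WHERE ssk=?",
                        (title, desc, int(has_al), ssk))
        db.commit()
        # Rendering and inserting the DBR index page from `sites` is left out.

A cron run would then just be run_once(sqlite3.connect("index.db"), keys) 
for that day's queue keys, followed by the DBR insert.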

New editions of edition-based sites can be recognized automatically so long 
as the names use the slash convention.  (Okay, I haven't been doing that, 
but I'm willing to start.)
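
For instance, if edition sites are named SSK@<key>/<sitename>/<edition>// 
(my assumption about the convention), the daemon can group entries by 
everything before the edition number and keep only the highest:

    import re

    # Matches keys of the form SSK@<key>/<sitename>/<edition>//
    EDITION_RE = re.compile(r"^(SSK@[^/]+/[^/]+)/(\d+)//$")

    def latest_editions(ssks):
        """Collapse multiple editions of the same site to the newest one."""
        best = {}                        # base key -> (edition, full SSK)
        for ssk in ssks:
            m = EDITION_RE.match(ssk)
            if not m:
                best[ssk] = (0, ssk)     # not edition-based; keep as-is
                continue
            base, edition = m.group(1), int(m.group(2))
            if base not in best or best[base][0] < edition:
                best[base] = (edition, ssk)
        return [full for _, full in best.values()]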

Anyone who wants to can create a category and set up a daemon to maintain 
its page.  He can let others know about it by submitting his index to one of 
the pre-existing indexes.  This in effect lets us grow a hierarchy of 
categories.
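
In code this costs nothing extra: a sub-index is itself just a freesite, so 
announcing it means inserting one more line into the parent category's queue 
(the function and keys here are placeholders):

    def announce_subindex(insert_queue_line, parent_category, subindex_ssk):
        """Submit a sub-index's own site key to its parent category's queue.
        `insert_queue_line` is whatever function the setup uses to append one
        line, via FCP, to a category's KSK submission queue."""
        insert_queue_line(parent_category, subindex_ssk)

    # e.g. a games sub-index announcing itself under "Warez":
    # announce_subindex(my_insert_fn, "Warez", "SSK@<subindex key>/games-index//")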

It's even possible to set up more than one page drawing from an existing 
submission queue.  This might be useful if a particular index's daemon is 
unreliable.


-------------------------------------------------
SSK@jbf~W~x49RjZfyJwplqwurpNmg0PAgM/marlowe//
