Paul Rubin wrote:
Paul Rubin <http://[EMAIL PROTECTED]> writes:

How does it do that?  It has to scan every page in the entire wiki?!
That's totally impractical for a large wiki.

So you want to say that c2 is not a large wiki? :-)

I don't know how big c2 is. My idea of a large wiki is Wikipedia. My guess is that c2 is smaller than that.


I just looked at c2; it has about 30k pages (I'd call this medium
sized) and finds incoming links pretty fast.  Is it using MoinMoin?
It doesn't look like other MoinMoin wikis that I know of.  I'd like to
think it's not finding those incoming links by scanning 30k separate
files in the file system.

c2 is the Original Wiki, i.e., the first one ever, and the system that coined the term. It's written in Perl. It's definitely not an advanced Wiki, and it has generally relied on social rather than technical solutions to problems, which might be a Wiki principle in itself. While I believe it used full-text searches for things like backlinks in the past, I believe it uses some kind of index now.
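
For what it's worth, such an index needn't be anything fancy. A hypothetical sketch (not what c2 actually does) of a reverse-link map, kept in memory and updated whenever a page is saved:

from collections import defaultdict

# Hypothetical in-memory backlink index: target page -> pages linking to it.
backlinks = defaultdict(set)
outgoing = {}   # page -> set of pages it currently links to

def update_links(page, linked_pages):
    """Call on every save: drop the page's old links, record the new ones."""
    for old_target in outgoing.get(page, set()):
        backlinks[old_target].discard(page)
    outgoing[page] = set(linked_pages)
    for target in outgoing[page]:
        backlinks[target].add(page)

def incoming_links(page):
    """Answer 'what links here?' without scanning every page file."""
    return sorted(backlinks[page])

The point is just that answering "what links here?" becomes a dictionary lookup instead of a scan of every page.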


Sometimes I think a wiki could get by with just a few large files.
Have one file containing all the wiki pages.  When someone adds or
updates a page, append the page contents to the end of the big file.
That might also be a good time to pre-render it, and put the rendered
version in the big file as well.  Also, take note of the byte position
in the big file (e.g. with ftell()) where the page starts.  Remember
that location in an in-memory structure (Python dict) indexed on the
page name.  Also, append the info to a second file.  Find the location
of that entry and store it in the in-memory structure as well.  Also,
if there was already a dict entry for that page, record a link to the
old offset in the 2nd file.  That means the previous revisions of a
file can be found by following the links backwards through the 2nd
file.  Finally, on restart, scan the 2nd file to rebuild the in-memory
structure.
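
For concreteness, here is a minimal Python sketch of that append-only scheme. The class and file names (AppendOnlyWiki, pages.dat, pages.idx) are hypothetical, and the second file is written as one JSON line per entry purely for readability; the post doesn't specify a format, and a real implementation would also need locking and the pre-rendered copies.

import json
import os

class AppendOnlyWiki:
    """Sketch: one data file for page bodies, one journal file for the index."""

    def __init__(self, data_path='pages.dat', journal_path='pages.idx'):
        self.data = open(data_path, 'a+b')
        self.journal = open(journal_path, 'a+b')
        self.index = {}   # page name -> (data_offset, journal_offset)
        self._rebuild()

    def _rebuild(self):
        # On restart, scan the journal to rebuild the in-memory structure.
        self.journal.seek(0)
        offset = 0
        for line in self.journal:
            entry = json.loads(line)
            self.index[entry['name']] = (entry['data_offset'], offset)
            offset += len(line)

    def save(self, name, text):
        # Append the page body to the big file and note where it starts.
        self.data.seek(0, os.SEEK_END)
        data_offset = self.data.tell()
        body = text.encode('utf-8')
        self.data.write(len(body).to_bytes(8, 'big') + body)
        self.data.flush()

        # Append a journal entry that points at the body and at the previous
        # journal entry for this page, if any (the backwards revision chain).
        prev = self.index.get(name)
        entry = {'name': name, 'data_offset': data_offset,
                 'prev_journal_offset': prev[1] if prev else None}
        self.journal.seek(0, os.SEEK_END)
        journal_offset = self.journal.tell()
        self.journal.write((json.dumps(entry) + '\n').encode('utf-8'))
        self.journal.flush()
        self.index[name] = (data_offset, journal_offset)

    def load(self, name):
        # Seek straight to the latest revision of the page.
        data_offset, _ = self.index[name]
        self.data.seek(data_offset)
        size = int.from_bytes(self.data.read(8), 'big')
        return self.data.read(size).decode('utf-8')

    def history(self, name):
        # Follow the links backwards through the journal to find old revisions.
        offsets = []
        entry_offset = self.index[name][1]
        while entry_offset is not None:
            self.journal.seek(entry_offset)
            entry = json.loads(self.journal.readline())
            offsets.append(entry['data_offset'])
            entry_offset = entry['prev_journal_offset']
        return offsets

Usage under those assumptions:

wiki = AppendOnlyWiki()
wiki.save('FrontPage', 'Welcome to the wiki.')
wiki.save('FrontPage', 'Welcome back.')
print(wiki.load('FrontPage'))       # latest revision
print(wiki.history('FrontPage'))    # data offsets, newest first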

That sounds like you'd be implementing your own filesystem ;)

If you are just trying to avoid too many files in a directory, another option is to put files in subdirectories like:

import os
import struct

base = struct.pack('i', hash(page_name))         # 4-byte hash of the page name (Python 2)
base = base.encode('base64').strip().strip('=')  # short directory name, padding removed
filename = os.path.join(base, page_name)
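
That snippet is Python 2 era: str.encode('base64') is gone in Python 3, and hash() is no longer stable across interpreter runs, so it couldn't be used to find the file again later. A rough modern equivalent, assuming the same page_name variable, might be:

import base64
import hashlib
import os

# Same idea with a stable hash and a '/'-free encoding.
digest = hashlib.md5(page_name.encode('utf-8')).digest()[:4]
base = base64.urlsafe_b64encode(digest).decode('ascii').rstrip('=')
filename = os.path.join(base, page_name)

urlsafe_b64encode also avoids '/' in the directory name, which the plain base64 alphabet can produce.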

--
Ian Bicking  /  [EMAIL PROTECTED]  /  http://blog.ianbicking.org