[Freenet-dev] Datastore formats and scalability

Travis Bemann Thu, 3 Aug 2000 22:14:51 -0400

On Thu, Aug 03, 2000 at 06:11:08PM -0500, Scott G. Miller wrote:
> Basically, all the design criteria for a datastore index are the same as
> for a general database index.
> 
> >  1.  How do you efficiently search indices (remember that datastore
> >      indices are like the flat filesystems on old bitty boxen)?
> B+ tree, or a hash structure.


Actually, I was already thinking of using a binary tree.
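
To make the search question concrete, here is a minimal sketch of a
logarithmic lookup over an on-disk index.  It is neither a true binary
tree nor a B+ tree, just a binary search over sorted fixed-width
records, which gives the same O(log n) number of reads.  The record
layout (a 20-byte key plus an 8-byte offset) and the names RECORD,
build_index and lookup are assumptions for illustration only; they are
not taken from nfreenetd or the Java node.

```python
import io
import struct

# Hypothetical record layout: 20-byte key, then an 8-byte big-endian offset.
RECORD = struct.Struct(">20sQ")

def build_index(entries):
    """Return index bytes holding (key, offset) pairs sorted by key."""
    return b"".join(RECORD.pack(key, off) for key, off in sorted(entries))

def lookup(index_file, key):
    """Binary-search a sorted fixed-width index; return the offset or None."""
    index_file.seek(0, io.SEEK_END)
    lo, hi = 0, index_file.tell() // RECORD.size
    while lo < hi:
        mid = (lo + hi) // 2
        # Read only the one record being probed, not the whole index.
        index_file.seek(mid * RECORD.size)
        found, offset = RECORD.unpack(index_file.read(RECORD.size))
        if found == key:
            return offset
        if found < key:
            lo = mid + 1
        else:
            hi = mid
    return None
```

Keys must be exactly 20 bytes here, since shorter keys would come back
padded with zero bytes.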

> 
> >  2.  How do you efficiently add items to datastore indices (you can't
> >      go and write an entire half gigabyte index to file each time you
> >      add a single file (yes, this is an extreme example, but it *will*
> >      happen))?
> Any modern 2nd-level storage algo handles this.

Yeah, and that is what I am planning to do with nfreenetd.  Have you
implemented such an algorithm in the Java Freenet node?
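
As a rough illustration of adding an entry without rewriting the whole
index, the simplest approach I can think of is to append a fixed-size
record to the end of the file.  This is only a sketch: the layout and
the names ENTRY, append_entry and read_entries are hypothetical, and a
real index would still need the appended records merged into whatever
sorted or tree structure is used for searching.

```python
import os
import struct

# Hypothetical record layout: 20-byte key, then an 8-byte big-endian offset.
ENTRY = struct.Struct(">20sQ")

def append_entry(path, key, offset):
    """Append one record; bytes already in the file are not rewritten."""
    with open(path, "ab") as f:
        f.write(ENTRY.pack(key, offset))

def read_entries(path):
    """Read every record back, in the order it was appended."""
    with open(path, "rb") as f:
        data = f.read()
    return [ENTRY.unpack_from(data, i) for i in range(0, len(data), ENTRY.size)]
```

Each call writes ENTRY.size (28) bytes regardless of how large the
index already is.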

> >  3.  How do you efficiently look for files to displace from the
> >      datastore when you add new files (yeah, like the other stuff in
> >      this list, it doesn't matter when you have a few hundred files in
> >      the datastore, but it soon becomes very significant when you have
> >      half a million files)?
> LRU stack separate from the index.
> 
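
A minimal sketch of what an LRU structure kept separate from the index
could look like, assuming keys are hashable and that eviction only has
to return the key of the least recently used file.  The class name
LruTracker and its methods are made up for this example.

```python
from collections import OrderedDict

class LruTracker:
    """Tracks access order only; the index itself lives elsewhere."""

    def __init__(self):
        self._order = OrderedDict()  # least recently used key comes first

    def touch(self, key):
        """Record that key was just stored or requested."""
        self._order[key] = None
        self._order.move_to_end(key)

    def evict(self):
        """Remove and return the least recently used key."""
        key, _ = self._order.popitem(last=False)
        return key

    def __len__(self):
        return len(self._order)
```

Both touch and evict are constant time, so finding a file to displace
does not require scanning the datastore.
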
> >  4.  How do you break up files in the datastore and the datastore
> >      indices to get around file size limits imposed by filesystems?
> what?

You don't remember the 2 GB file size limit imposed by Linux?  Of
course, breaking up files and using deletion-resistant coding would
get around this, but you need to make software as foolproof as
possible.  Some idiot with an SDSL line *will* eventually try to put a
full uncompressed DeCSSed version of The Matrix onto Freenet.  You need
to consider this case, even if you don't think it will ever happen.
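
To sketch the breaking-up idea: given a per-file size cap, a large item
can be mapped onto numbered parts, and any byte offset can be mapped
back to a (part, offset within part) pair.  The function names below
are hypothetical, and MAX_PART is just the largest value a signed
32-bit file offset can hold, used here as an assumed cap.

```python
# Assumed cap: largest signed 32-bit offset, i.e. just under 2 GB.
MAX_PART = 2**31 - 1

def split_ranges(total_size, max_part=MAX_PART):
    """Return a list of (part_number, start, length) covering total_size bytes."""
    if max_part <= 0:
        raise ValueError("max_part must be positive")
    parts = []
    start = 0
    while start < total_size:
        length = min(max_part, total_size - start)
        parts.append((len(parts), start, length))
        start += length
    return parts

def locate(offset, max_part=MAX_PART):
    """Map a byte offset in the whole item to (part_number, offset_in_part)."""
    return divmod(offset, max_part)
```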

> >  5.  Binary formats make datastores *easier* to work with (you don't
> >      have to rewrite the entire datastore each time you change the
> >      status of a file in the datastore), but you have to watch out for
> >      stuff like endianness.
> Duh.
> 
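
On the binary format point, a small sketch of both halves: a fixed
byte order so the file reads the same on any machine, and an in-place
update that touches only one field of one record.  The layout (a
4-byte flags word plus an 8-byte length) and the names HEADER,
pack_header and set_flags are assumptions for illustration.

```python
import struct

# Hypothetical record: 4-byte flags, 8-byte length.  The ">" prefix forces
# big-endian with no padding, whatever the host CPU uses natively.
HEADER = struct.Struct(">IQ")

def pack_header(flags, length):
    """Serialize one record in the fixed byte order."""
    return HEADER.pack(flags, length)

def set_flags(buf, record_no, flags):
    """Overwrite only the flags field of one record in a bytearray."""
    struct.pack_into(">I", buf, record_no * HEADER.size, flags)
```

With a real file the same update would be a seek to
record_no * HEADER.size followed by a 4-byte write.
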
> > All of these issues have not been addressed yet, and they will be
> > significant problems once Freenet reaches primetime.  I'm planning on
> > not procrastinating on solving these problems, but instead solving
> > them from the first version of nfreenetd.  All of this stuff makes
> > writing a Freenet node daemon far more complex and difficult than it
> > appears at face value.
> 
> I have to challenge this assumption.  Datastores aren't meant to be this
> multi-gigabyte repository.  Running a site like that flies in the face of
> the whole idea behind Freenet and distributed datastores.  You don't *need*
> to run these enormous centralized, webserver-like nodes.

This depends a lot upon Freenet's topology, which in turn depends on
the mechanisms through which nodes find out about each other.  If it
takes a UUCPNET-type topology, you *will* have these sorts of nodes.
However, if it takes a topology that is not based on a backbone, you
will have *fewer* of these sorts of nodes.  One important factor is
whether there are a small but significant number of nodes with large
datastores and big pipes.  The more of these types of computers, and
the fewer computers in between this sort of box and the normal user's
desktop box, the more Freenet will take a UUCPNET or Usenet type
topology.

> In fact, I think a lot of your criticisms really don't make much
> sense.  You're not designing a webserver.  It doesn't *have* to scale to
> enterprise levels.

Okay, so what if I have a pile of money I want to waste and I use it
to buy a nice quad Pentium III box with a nice big 40 GB hard drive
and a really nice video card (for playing Quake III Arena)?  I have
all this hard drive space to spare, so I devote 20 gigs to a Freenet
node datastore.  I also have a real nice SDSL line, and I don't
download piles of pr0n and music all day, so I dedicate a large amount
of bandwidth to the Freenet node.  People who know that I have a fast
computer with a ton of bandwidth and a massive datastore will start
hooking up their local nodes to my node.  Next thing you find, my
computer would get bogged down if it wasn't for my nice scalable
Freenet node implementation.  Yeah, this would not happen in the ideal
case, but the ideal case rarely occurs in most situations.  In a
situation like this one, you do want a very scalable Freenet node.
Yeah, this might not be likely to happen in your mind, but it *will*
happen.

Hackers are racing

-- 
Travis Bemann
Sendmail is still screwed up on my box.
My email address is really bemann at execpc.com.
