Hello again :)

----- Original Message -----
From: "Tom Kaitchuck" <[EMAIL PROTECTED]>
To: <[EMAIL PROTECTED]>
Sent: Monday, August 04, 2003 8:23 PM
Subject: Re: [Tech] freenet not suited for sharing large data
> On Monday 04 August 2003 04:20 am, Gabriel K wrote:
> > > A. Just broadcast your request to everyone.
> > > Doesn't scale.
> >
> > In DC you broadcast your request to everyone. But a better approach is
> > to organize the search like Chord does, I think.
> > Anyway, let's say it were possible for a node to send a request to
> > another neighbour, which passes it on, and in time the request forks,
> > so that the requester doesn't have to wait for the message to pass
> > through ALL nodes serially. And even though it forks, and the same
> > request is asked simultaneously, there are NO looping problems: a node
> > only receives the message once. I would say that scales, wouldn't you?
>
> No. No it does not. Even if you could somehow make an optimal network
> (impossible unless you serialize everything), where you only asked the
> minimum number of people in order to find the data, and there were never
> any loopbacks, you still have to ask (on average) n*m nodes for every
> request, where n is the number of nodes and m is the fraction of nodes
> that do NOT have the data you want. Also, every time the data is not
> available, you have to ask everyone. This sucks royally.

Hmm, I don't quite follow your logic here... why n*m? On the contrary, you
only ask n nodes at most. Also, it is possible to send the message ONCE to
another node, and then it will propagate through the entire network, with
each node receiving the message only once (and forwarding it). The
propagation is not serial through all nodes; it has some parallelism,
depending on how many nodes you want each individual node to know about.
This can be achieved by ordering the nodes in rings connected to each
other. I sketch my idea with illustrations and text at
http://www.student.nada.kth.se/~m98_khl/O-net.pdf, feel free to have a
look!

> > > B. Have a central server to keep track of it for you.
> > > Centralized, and vulnerable to attack.
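The "each node receives the request only once, even though it forks" claim
above can be sketched as flooding with a seen-set: duplicate arrivals are
dropped, so every reachable node handles the request exactly once. This is a
hypothetical illustration (node names and topology are made up), not the
O-net design from the pdf:

```python
from collections import deque

def flood(neighbors, origin):
    """Propagate a request from origin; every reachable node handles it
    exactly once, no matter how many paths fork toward it."""
    seen = {origin}
    queue = deque([origin])
    forwards = 0
    while queue:
        node = queue.popleft()
        for peer in neighbors[node]:
            if peer not in seen:      # duplicate copies are simply dropped
                seen.add(peer)
                queue.append(peer)
                forwards += 1         # one forwarded copy per new node
    return seen, forwards

# A tiny illustrative topology with forking paths (A->B->D and A->C->D):
neighbors = {
    "A": ["B", "C"], "B": ["A", "D"], "C": ["A", "D"], "D": ["B", "C"],
}
reached, forwards = flood(neighbors, "A")
```

Note that even with perfect deduplication, a request for missing data still
costs one message per node, which is the scaling objection being raised.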
> >
> > Modification: you can have a central server making decisions for the
> > network and individual nodes, and STILL it can be safe from attack IF
> > no node knows its IP address!
> > It is VERY practical to have such an authority that every node trusts
> > and obeys. If they trust ONE instance, this makes it a lot easier to
> > solve many problems, because you avoid scalability problems with voting
> > processes (many times, if any ONE node can decide on its own, that
> > power can be abused).
>
> 1. You can't hide a server that people have to connect to.

True.

> 2. If you could, would you trust a server that you did not know who
> controlled?

Well, the central server can only map who transfers a file to whom, but
nothing else. Also, I think networks would form whose founders are already
known as fighters for "liberty of speech", or warez traders, or whatever...

> 3. The servers still need to know where the data is. Are they supposed
> to trust that the individual clients are being honest about what they
> have and whether they sent it? If the data is going through the server,
> it would be VERY slow.

I agree fully. However, I must say I still don't understand that way of
thinking about nodes that join the network and then don't share their
data... I don't see that as a dangerous attack. But in the idea described
in the pdf above, that's an easy fix: because you have the central point
that every node trusts and obeys, it is easy.

> 4. It can still be attacked by crackers, lawyers, and network failures.

Not if they don't know the IP, and they don't. Besides, you can have
several central points, so if one is attacked, another can take over. Hmm,
this is also explained in the pdf...

> > > C. Use some sort of predefined routing scheme, where it is
> > > determined in advance which node holds which data.
> > > The holder of any given piece of data can easily be determined.
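The "several central points, so if one is attacked, another can take over"
argument amounts to simple client-side failover. A minimal sketch, with
invented coordinator names and a stubbed liveness check standing in for a
real connection attempt:

```python
def first_alive(coordinators, is_alive):
    """Return the first reachable coordinator in preference order."""
    for c in coordinators:
        if is_alive(c):          # in reality: try to connect, with timeout
            return c
    raise ConnectionError("no coordinator reachable")

# Simulate coord-1 being taken down; clients silently fail over.
up = {"coord-1": False, "coord-2": True, "coord-3": True}
active = first_alive(["coord-1", "coord-2", "coord-3"], lambda c: up[c])
```

The hard part, of course, is replicating the coordinator's state between
the central points, which this sketch does not address.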
> >
> > This COULD be solved by letting the "hidden" central server take care
> > of such things, but it is better to put as much work as possible on the
> > nodes, to reduce the importance of and load on the central point (if
> > there is one). This is also why I think it's good to leave the data at
> > the nodes that hold it from the beginning: no sorting into the network,
> > and no extra storage space on each node for someone else's files. You
> > are only responsible for the stuff you WANT yourself.
>
> If you don't store things, the network speed is bounded by the person
> sharing the data.

Well, you have a point there. The load is well distributed in Freenet. But
so it is in BitTorrent, if you ask me. And it's not that hard to achieve
this load balancing; I think that should be a feature of my idea as well.

> > Umm, are you saying that the HOLDER of the data is malicious and
> > doesn't send the data? I would say there is no defense against that
> > attack.
>
> YES THERE IS! Don't route data there next time.

Heh, well, that argument is only relevant to Freenet, where you upload the
data first. In "my" model you don't upload the data first, so an "evil"
node doesn't hold anyone else's data, only its own. So it can't do any harm
that way.

> If you can't do this, any node can just say it has every piece of data on
> the network, and always return requests very quickly. Pretty soon it's
> killing half the network's traffic. Plus, a single malicious node could
> contact every other node and request to be a proxy for it. Then it can
> pretend to be a ridiculous number of nodes, and essentially bring down
> the whole network.

Hmm... well, have a look at the pdf above. That stuff can't happen in "my"
model. Well, OK, the first attack you mentioned can happen, but that node
would pretty soon be reported to the central point and kicked out. And in
my model you cannot volunteer to be a proxy, and you know very little of
the IP addresses in the net.
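The "don't route data there next time" defense boils down to keeping a
per-peer track record and preferring peers that have actually delivered. A
minimal sketch of that idea (this is NOT Freenet's real routing algorithm,
just an illustration of the principle):

```python
class PeerStats:
    """Track how often a peer actually delivered the data it promised."""
    def __init__(self):
        self.successes = 0
        self.failures = 0

    def record(self, delivered: bool):
        if delivered:
            self.successes += 1
        else:
            self.failures += 1

    def score(self) -> float:
        # Laplace-smoothed success ratio, so unknown peers start neutral
        # at 0.5 instead of at an extreme.
        return (self.successes + 1) / (self.successes + self.failures + 2)

def pick_route(stats: dict) -> str:
    """Route the next request to the peer with the best track record."""
    return max(stats, key=lambda peer: stats[peer].score())

stats = {"honest": PeerStats(), "liar": PeerStats()}
stats["liar"].record(False)    # claimed to have the data, never delivered
stats["honest"].record(True)
chosen = pick_route(stats)
```

A node that answers every request but never delivers quickly scores below
honest peers and stops attracting traffic.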
> > Well, you are assuming user activity is spread equally throughout the
> > day...
>
> This is a fair assumption, because we are routing based on the hash of
> the data!

I'm sorry, I don't understand that reply... come again? :)

> > I say it is likely that networks will form where most users are from a
> > specific region of the world, for instance Scandinavia. So their
> > activity will have its peak at some time. And at this time I think it's
> > not hard to reach the limit!
>
> It should be very hard to reach the limit. The nodes the data is coming
> from are randomly distributed. The intermediate nodes are fairly randomly
> distributed, and if more original requests are coming from a confined
> area, the load balancing should shift so that more of the intermediate
> hops are routed elsewhere to compensate.

Hmm, maybe I have misunderstood something about Freenet... I always
assumed there are *many* Freenet networks. Is there just one? I thought it
was meant to be like DC, where anyone can start a little network.

> > Anyway, you are missing the point here... Sure, they might not reach
> > this limit, BUT the PROTOCOL should not require so much BW! Maybe a
> > user doesn't only want to use Freenet. Maybe he runs DirectConnect as
> > well, while doing other stuff that requires some BW...
> >
> > Not in Freenet... and that's why I think it's not suited for large file
> > transfers... at least not with the level of activity I think one should
> > assume when writing a protocol.
>
> What is better? Any individual can max out their download speed, even if
> the file is unpopular. You are making the flawed assumption that inserts
> are very common, nearly as frequent as requests. In reality, requests
> probably outweigh inserts 1000 to 1! Think about any webpage: how many
> times is it viewed compared to how many times it is updated? Now in
> Freenet the same applies to any file.
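For what "routing based on the hash of the data" means in practice: the
request key is a hash of the content, so keys land uniformly across the
keyspace and the nodes serving them are spread out regardless of where the
requesters sit geographically. A simplified sketch (numeric node ids and
"numerically closest" routing are illustrative assumptions, not Freenet's
exact scheme):

```python
import hashlib

def key_for(data: bytes) -> int:
    """Derive the routing key from the content itself."""
    return int.from_bytes(hashlib.sha1(data).digest(), "big")

def closest_node(node_ids, key: int) -> int:
    # Route toward the node whose id is numerically closest to the key;
    # since keys are hash outputs, load spreads evenly over node ids.
    return min(node_ids, key=lambda n: abs(n - key))
```

So even if all the requesters are in Scandinavia, the keys they request,
and hence the nodes that serve them, are scattered across the network.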
> It is not like other networks where, once you download something, it
> becomes "shared". Content only needs to be inserted into the network
> ONCE. (Two people CAN'T insert the same content.)

Maybe my assumption is flawed... Maybe the contents of a file sharing
network are pretty static... maybe you are right here, yes...

> Now let's think about what you are proposing. If you have a centralized
> system in charge of routing, you want it to be somewhat distributed, you
> want end-to-end encryption, and you want none of the central servers to
> know anything incriminating. So take MixMinion. Then you want to run
> files over it. But you want to be anonymous, so you allow any client to
> connect to another client as a proxy. Then, when data is found, the
> proxies transfer the data to one another. Then you start adding
> optimizations. It would be a good idea for the proxies to cache the
> content for the next request. It would also help if the proxies learned a
> little about their immediately surrounding network, so they wouldn't have
> to go through the main server if they could find the file locally. For
> security, you want to encrypt the files. But it would be better if they
> were broken up, so you could do BitTorrent-style downloading from the
> network. Then, to make the server's job easier, you want to categorize
> the proxies based on content. (I.e., you connect to the other proxies
> that are as close to what you want to share as possible.) That scheme
> doesn't have to be perfect, just the best match among the nodes you
> already know. Then, as I explained before, to prevent the network from
> being attacked, you have to return the data along the request path, but
> you can cut a few steps out.
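The "two people CAN'T insert the same content" point follows from
content-addressed keys: if the key is derived from the bytes themselves, a
second insert of identical content produces the identical key and changes
nothing. A simplified sketch (real Freenet content-hash keys also involve
encryption and splitting, which is omitted here):

```python
import hashlib

store = {}

def insert(content: bytes) -> str:
    """Insert content under its own hash; re-inserting is a no-op."""
    key = hashlib.sha256(content).hexdigest()
    if key not in store:   # a second insert of the same bytes hits this key
        store[key] = content
    return key

k1 = insert(b"some shared file")
k2 = insert(b"some shared file")   # "inserted again" by someone else
```

Both inserters end up pointing at the same single copy, which is why
popularity drives caching rather than duplicate inserts.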
> Then, both for security and to speed up the network, it would be best
> for you to offload all your shared files directly to the proxies, and
> dedicate all your storage space to your area of specialization, as
> determined by the nodes that are using you as a proxy.
> Congratulations! You have recreated Freenet! The only difference is that
> there is a big server in there. If you want Freenet with a big server,
> there is a shorter way:
> 1. Find a very fast server with HUGE bandwidth.
> 2. Install Freenet on it.

I think you mixed up some things I said... I don't want a central server
in charge of routing. I only want it to take care of things that would be
dangerous for the network as a whole if left to a single node to decide.
Only that. Well, I think if you read the pdf, it will explain many of the
ideas I have been rambling about... maybe things will be easier to discuss
after :) Let me know when you have read it, or if you just don't want to,
hehe. Btw, thanks for the conversation, I find it very interesting!

/Gabriel

_______________________________________________
Tech mailing list
[EMAIL PROTECTED]
http://hawk.freenetproject.org:8080/cgi-bin/mailman/listinfo/tech
