Hi

On Monday, 19 September 2005 at 17:07, Matthew Toseland wrote:
> On Sun, Sep 18, 2005 at 04:12:19PM +0200, Guido Winkelmann wrote:
> > Hi,
[...]
> > If 0.7 should be done this way, the mentioned external daemon should, of
> > course, come bundled with the standard Freenet package. A stripped down
> > fred on its own would be of no use to most, just like a TCP/IP stack on
> > its own - or i2p...
>
> And it would have to run in the same JVM, for performance and user
> friendliness reasons (an extra several tens of megs for a second VM is
> *BAD*). Why is it beneficial?

It doesn't have to run at all, and if it runs, it doesn't have to run on the 
same computer. Given Freenet's extremely high computing resource needs, I 
think this is beneficial.

> > Things that should be thrown out that are present in current versions
> > (5xxx) include:
> >
> > - The Web interface (definitely - but it might be wise to leave a
> > stripped down version for strictly node-maintainance related purposes.)
>
> If you're going to chuck it out, then chuck it out. We do have FCP,
> after all.

I'm sorry, I'm not a native English speaker. I haven't quite understood what 
you were trying to tell me there.

> > - The Distribution Servlet (definitely)
> > - Everything related to splitfile handling (maybe, more talk on that
> > further down)
>
> Yuck. That would mean that we have to have two separate FCP protocols -
> high level FCP and low level FCP.

No, not necessarily. If you move splitfile handling out of the node, you 
cannot supply a high-level FCP at all. If you leave it in, you could support 
both a low-level and a high-level FCP, but there's no good reason for the 
low-level one then. So, however you do it, you'll need only one FCP (/maybe/ 
a second one for streams, but that's an entirely different discussion).

> IMHO we must not require every single 
> client author to implement binary metadata, compressed metadata,
> hierarchical metadata, FEC codecs, and everything else required; we must 
> have a "high level FCP" including reasonably transparent splitfile
> support,

No, not "reasonably" transparent. Fully transparent support or none at all - 
loading a 500 MiB file should be no different (to the programmer of a 
third-party app) from loading a 4 KiB file (well, except for resource 
consumption...). But that's a different discussion again.

> and it should probably include download direct to disk, 
> download when the client is not connected, and queueing, as well, for
> reasons related to starting splitfile downloads from fproxy.

I don't think that's a good idea. Better have a properly done download manager 
outside the node than a half-assed one inside it.

> > - The command line client (I don't think there's a good reason to include
> > that in the normal freenet.jar)
>
> It's tiny, and it's a useful debugging tool. There is no compelling
> reason to include it in the jar except to save 10kB or whatever it is.
> Which is insufficient reason given Fred's memory footprint!

Well, this is not an important part of the whole discussion anyway, but I 
still think it should get its own .jar. Having to call it by its class name 
from freenet.jar may be intuitive for Java programmers, but not for anyone 
else. This isn't how things usually work on the command line; usually you 
have one utility = one binary. The way it is now, most people won't even know 
there is a supplied command-line client.

> > and maybe other stuff I don't know about.
> >
> > The advantages to this approach are:
> >
> > - The node source code becomes smaller and easier to maintain.
>
> It does? How so? We'd have to maintain two separate source modules!

Yes, but it won't have to be the same people who currently maintain fred's 
source maintaining the separate daemon's. I can, of course, not guarantee 
that there will be people who will want to do this, but this way, your 
chances of ever being able to delegate a significant portion of the work to 
others will increase dramatically.

> > - The operation of the node becomes more reliable. (There is less stuff
> > in there that might cause it to crash.)
>
> The node must be cleanly separated from the UI code, including e.g.
> splitfile assembly. This happens via interfaces, not via running it in a
> separate VM, and not via having it in a separate CVS module.

It happens by running it in a different process or even on a different 
computer.

> > - Resource consumption of the node will be lower and much more
> > predictable.
>
> Only if it runs without the extra daemon. Which it won't for the
> majority.

You're making a big assumption here. Who says everyone will really want to use 
the services you provide by default with Freenet? Even right now there are 
probably lots of users who only use fproxy for status info about the node and 
mostly run their nodes only for frost or pm4pigs or something like that. For 
these users, fproxy is mostly bloat. (I am one of those users.)

Freenet is, in broad terms, a new anonymizing data transport layer. As such, 
it can be used for a broad range of applications, most of which none of us 
has even thought of yet. "Freesite browsing" might become the least important 
application before you know it.

> The memory overhead for running two JVMs instead of one will be 
> significant,

As I said:
- It doesn't always have to be running
- It can run on a different computer

> unless the JVM automatically coalesces, in which case there 
> is no point anyway.
> > - The code for enduser related functionality becomes a lot more
> > accessible for new programmers. (Patches to fad are pretty much
> > guaranteed not to interfere with the node's core operations and, thus,
> > are more easily accepted.)
>
> Nobody (in the OSS world) codes java. And of those who do, they don't
> use Fred's existing interfaces. This is a matter of documentation and
> communication and stubbornness, not a matter of what is in what jar.

There definitely are people in the OSS world who code Java. Look at Azureus, 
for example, or at I2P. It's also not simply a matter of stubbornness. Look 
at other large OSS projects like Gnome, KDE, mldonkey[1] and whatnot. These 
are thriving; Freenet isn't. I think this is a problem, I think it should be 
addressed, and I think that splitting user-related functionality from core 
functionality is an important step in that direction. (Documentation is 
probably even more important.)
If a new programmer decides he wants to help out, the first thing he'll have 
to do is find out how things work in the existing code, where he'll have to 
tinker with it to achieve certain goals, and where his own code might fit 
into the whole thing. The bigger and more complex the existing codebase is, 
the harder and more frustrating this process gets. Right now, a new 
programmer who has some ground-breaking ideas for fproxy, for example, will 
have to wade through tons of code dealing with things like datastore 
management, routing table management, key processing and whatnot.

If the code were in a separate program, this would be a lot easier. The 
general outline of the program, "the way of doing things" in there, could be 
simpler, more streamlined and better adapted to the needs of an end-user 
tool.

Then there is also the issue that people working on end-user related code 
might easily break some of the node's core functionality if it's all one big 
program. Someone from the project (I think it was Ian Clarke) said the 
project had learned the hard way that, lots of times, parts of the code that 
shouldn't have affected the performance of the node did. A consequence of 
that, IMHO, should be: "well, then don't put so much code in there that might 
go wrong." "Keep It Simple, Stupid", they say.
(Note how many of the internet technologies that enjoy real long-term 
success, like IP, TCP, UDP, SMTP, NNTP, rfc822 messages, HTTP... did just 
that.)

> > - The external daemon can be written in a language other than Java
> > (Whether that's an advantage may depend on your POV. IMO, it's a big
> > one.)
>
> Not if it is bundled with Fred it can't. The high level daemon must be
> maintained by the project itself, meaning it has to be in java.

Why? Have you sworn a holy oath to Java or something?

> And, 
> without intending to start a language flamewar here, you do know that
> java can be compiled, right?

I have basic knowledge of Java; it's mandatory at university. Still, I don't 
see what this has to do with anything.

> > - Third party implementations of the node itself can be done more easily.
> > (The programmers won't have to rewrite all of the most basic user
> > interface stuff from scratch) (I think this is a pretty important issue
> > in long run.)
>
> Why does it matter? As long as there are clear internal interfaces it is
> quite possible to plug in an alternative node implementation. And given
> the SUBSTANTIAL effort involved in cloning either the node or the high
> level code, this is not a big deal.

Cloning a Unix-like operating system takes substantial effort, too. Yet 
people have done it; Linux is proof of that. The availability of userspace 
tools which are more or less independent of the kernel they're running on is 
an immeasurable help in such a case. Had the GNU tools been tightly 
integrated into the kernel they were designed for back in 1991, Linux would 
not have come very far.

> > - We can draw a very clear line between what's enduser stuff and what
> > isn't. (And the node should not be enduser stuff. Install it, configure
> > resource restrictions, start it, stop it should be the sum of the cases
> > where the (opennet-) enduser has to interface with it.)
>
> Configure resource restrictions? How exactly? Via command line? Via
> telnet to port 8481?

There is a large number of possibilities for this. It'll have to be done one 
way or the other anyway.

> By definition all user friendly code must be left out of the node!

Not quite. What I am saying is that the user-interface code in the node 
should be as little as possible but as much as necessary. Some stuff just 
needs to be done at that level - resource restrictions (i.e. limits for 
bandwidth and disk space usage), peer management in the darknet case, 
keeping the node up to date, stuff like this. This is all just node 
administration/maintenance, i.e. none of the "essential" functionality. 
(Essential functionality being the sort of functionality that can be the 
reason for people to run a node in the first place, i.e. things like 
freesite surfing, or using some sort of message board or mail software via 
Freenet.)

This might be done via a stripped-down version of fproxy or, better yet, via 
some external utility that simply edits the config file(s) and sends the 
node some sort of signal when it's done, to make it reread its configuration.

> > - Fred can finally become a lean and mean unbloated routing machine.
>
> Which is totally meaningless if it requires a bloated user daemon for it
> to do anything useful.

Do you require the Mosaic browser to do anything useful at all with HTTP, or 
even TCP/IP? You might use it, but there are plenty of other useful options.

> > Disadvantages:
> >
> > - The project needs to make extra sure that "fad" really is in a usable
> > state before it can release 0.7
> >

[... Discussion about whether to do splitfile handling in the node or in the 
clients ...]

There might be a third option: have a third, independent agent somewhere 
which does nothing except splitfile handling. This has the following 
advantages:
- The node can really be reduced to (32 KiB) key and stream 
handling/forwarding and datastore management.
- The splitfile handling agent can be extremely simple: it does nothing 
except splitfile encoding/decoding.
- The splitfile handling agent can run on a different machine.
- The splitfile handling agent only needs to run when the node is actually in 
use. If the user isn't using his node for a while, it can fully concentrate 
on handling/forwarding requests from its network neighbours.
- Third-party apps can still be as simple as they would be if splitfile 
handling were done inside the node.

Disadvantages:
- This would mean we really need to have a low-level FCP and a high-level FCP.

I see two ways of integrating this external agent with Freenet:

- Put it between the node and the client apps
This way, clients would talk directly to the splitfile handling agent using a 
high-level FCP while the agent would use the node as its backend, talking to 
it using a low-level FCP.

- Have the node delegate splitfile handling to the agent of its own accord.
This way, clients would still talk directly to the node. If the node receives 
a request for a key, it simply forwards the request to the agent and waits. 
The agent will then request that key from the node using low-level FCP (which 
says "give me the raw 32 KiB block behind this key" instead of "give me the 
file referenced by this key") and decide whether it's a splitfile or not. If 
it isn't, it simply truncates the block to the exact file length and gives it 
back as the final result. If it is, the agent requests the remaining blocks, 
decodes the file and hands it back.

The last approach has the advantage that it could still be implemented much 
later without stirring up too much mud.

> > Thus far for my proposal. Comments would be appreciated.
> >
> >     Guido

        Guido

[1] mldonkey is interesting for yet another reason: They're using a language 
that's even less popular than Java, and they're still doing better than 
Freenet.
