[Freenet-dev] Music client

Michael ROGERS Fri, 28 Jul 2000 11:01:22 +0100

[Freenet developers - if you'd like us to move this discussion off-list, just
say the word]


I've been thinking about the implementation of a music-sharing client. Here 
are some thoughts.

----------------------------------------------------------------------------

Songs are encrypted with a random RC4 key, and the ciphertext is then hashed.
The hash is used as the key to insert the file into Freenet (CHK). The hash
and the RC4 key are concatenated to form the "read key". You need this to
retrieve the file and decrypt it. The CHK alone is not enough to decrypt the
file, so the nodes storing and handling the file *cannot* know what they're
handling.
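
As a rough illustration of the scheme above, here is a minimal Python sketch.
The 16-byte key length, the use of SHA-1 as the hash, and the names
prepare_insert and retrieve are assumptions made for the example, not
anything Freenet defines.

```python
import hashlib
import os


def rc4(key: bytes, data: bytes) -> bytes:
    """RC4 stream cipher; the same call encrypts and decrypts."""
    s = list(range(256))
    j = 0
    for i in range(256):                       # key scheduling
        j = (j + s[i] + key[i % len(key)]) % 256
        s[i], s[j] = s[j], s[i]
    out = bytearray()
    i = j = 0
    for byte in data:                          # keystream generation
        i = (i + 1) % 256
        j = (j + s[i]) % 256
        s[i], s[j] = s[j], s[i]
        out.append(byte ^ s[(s[i] + s[j]) % 256])
    return bytes(out)


def prepare_insert(song: bytes):
    """Return (ciphertext, chk, read_key) for one insert of a song."""
    rc4_key = os.urandom(16)                   # assumed key length
    ciphertext = rc4(rc4_key, song)
    chk = hashlib.sha1(ciphertext).digest()    # assumed hash function
    return ciphertext, chk, chk + rc4_key      # read key = CHK + RC4 key


def retrieve(read_key: bytes, fetch) -> bytes:
    """Split a read key, fetch the ciphertext by its CHK and decrypt it."""
    chk, rc4_key = read_key[:20], read_key[20:]   # 20 = SHA-1 digest size
    return rc4(rc4_key, fetch(chk))


# A dict stands in for the network: nodes only ever see chk -> ciphertext.
network = {}
ciphertext, chk, read_key = prepare_insert(b"pretend this is an mp3")
network[chk] = ciphertext
print(retrieve(read_key, network.__getitem__))
```

Calling prepare_insert twice on the same song gives two different CHKs,
because each call draws a fresh RC4 key.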

Note that encrypting the file before hashing stops the redundancy-removing
property of CHKs from working; multiple identical files can exist under 
different CHKs. This wastes space, but avoids this possible attack:

        You get a copy of the file you wish to censor and hash it to get its 
        CHK. Now you request a document with a *very similar* CHK. This will 
        probably be stored on the same node as the file you're looking for, 
        but because you don't request the file itself, you don't spread it 
        around the network. You make a note of the node which is reported to 
        be providing the file you requested. (This may not be the node which 
        is actually providing it.) Repeat with several CHKs similar to the 
        CHK of the document you're trying to censor. Because freenet adjusts 
        its topology to make direct connections between nodes which share a 
        lot of traffic, you will sooner or later be able to work out which 
        node is really providing the documents you're requesting. Then you 
        attack that node by out-of-band means (ping -f or court order).

Encrypting the file before hashing it makes it possible to store multiple 
copies, so if one is censored you can easily insert another (and it will be
stored on a different node). It also makes it impossible to guess which CHK 
the file is stored under, but that's not much of an advantage because you 
have to reveal the CHK at some point so the users can download the file.  :)


PROBLEM 1: MULTIPLE ENCODINGS, ONE NAME

The first problem is that we need to be able to retrieve files by giving 
*only* the song title and artist's name. The suggestion of requiring a
version number / encoding number to be provided doesn't work - the user would 
have to guess the version number. If there are few enough version numbers per 
song to make them guessable, they can all be squatted. If there are enough to 
make squatting them all impractical, guessing is also impractical. If the 
user has to get the version number by out-of-band means, he might as well 
just get the CHK of the file and we can forget about names altogether.

So to prevent key squatting and allow searching by name, it needs to be 
possible to store any number of files and use a single string to retrieve 
them. The users can then decide what's a valid encoding and what's noise (or
misnamed).

This means we need a directory for each artist+song string, stored under the
string's hash (KHK). The directory contains the read keys (CHK plus RC4 key)
of the actual files, which are stored separately. A fixed array of entries is
not enough - if each directory had, say, 256 slots, a malicious user could
quickly squat all of them for a given artist+song string. What we need is a
dynamic list of entries accessed via a single KHK.

It must be possible to add entries to a directory when new encodings of a song
are stored. It must be possible to get the list of entries. Optionally, it
should be possible for the node holding the directory to perform housekeeping
tasks such as removing redundant entries.
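
A directory with these two operations might look something like the Python
sketch below. The way the artist+song string is normalised, the "|"
separator, the use of SHA-1, and the DirectoryStore class are all
assumptions for illustration; the only thing taken from the text is "hash
the string to get the KHK, keep a growable list of entries under it".

```python
import hashlib


def khk(artist: str, title: str) -> bytes:
    """Hash the artist+song string to get the directory's key (KHK)."""
    # Lower-casing and the "|" separator are assumptions, chosen so that
    # "Some Band" and "some band" land in the same directory.
    name = artist.strip().lower() + "|" + title.strip().lower()
    return hashlib.sha1(name.encode("utf-8")).digest()


class DirectoryStore:
    """Toy stand-in for the node(s) holding directories, keyed by KHK."""

    def __init__(self):
        self._dirs = {}                    # KHK -> list of entries

    def add_entry(self, key: bytes, entry: bytes) -> None:
        """Append an entry; the list has no fixed size to squat."""
        entries = self._dirs.setdefault(key, [])
        if entry not in entries:           # minimal housekeeping: no repeats
            entries.append(entry)

    def list_entries(self, key: bytes) -> list:
        """Return a copy of every entry stored under this KHK."""
        return list(self._dirs.get(key, []))


store = DirectoryStore()
key = khk("Some Band", "Some Song")
store.add_entry(key, b"entry-for-encoding-1")
store.add_entry(key, b"entry-for-encoding-2")
print(len(store.list_entries(khk("some band", "some song"))))
```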


PROBLEM 2: DIRECTORIES ARE A SINGLE POINT OF FAILURE

The directory is a single point of failure for the string it represents. If
the node holding the directory is malicious, it can prevent access to all
encodings of the song. This is somewhat mitigated by the fact that the node
doesn't know which song it is preventing access to, because it only knows the
hash of the artist+title string. Nevertheless, a malicious node could prevent
access to some (random) song no matter how many times it was stored, causing
great annoyance to the users. Also, directories will be vulnerable to the
close key attack outlined above. I don't see how this can be avoided - there
must be a single directory for each song, so there is a single point of
failure. This is the biggest problem with the scheme that I can see.


PROBLEM 3: WHEAT AND CHAFF

Let's assume that those who oppose the free movement of information aren't
stupid. Realising that freenet can't be shut down by a court order, sooner or
later they will use technological means to try and close it down or make it 
unusable. Obvious attacks include running malicious nodes, running malicious 
clients, discovering nodes and attacking them by out-of-band means, and (to
prevent music sharing) submitting dummy encodings of songs which either squat 
keys or waste users' bandwidth.

The first three attacks have to be dealt with by Freenet's design. The
problem of dummy encodings has to be dealt with at the user level - only the
users can separate the wheat from the chaff.

Freenet is designed so that files which are requested a lot spread around the
network; files which are never requested eventually disappear. To exploit this
mechanism, we need to be able to check the quality of an encoding without
downloading it. We need a way for users who have previously downloaded the
encoding to tell us whether it's worth downloading.

My solution is Slashdot-style moderation. This style of moderation is fairly
robust - at least, it does not allow vote-stuffing. You can only moderate 
when you are given moderation points, which happens randomly and 
infrequently. For Freenet it would work like this:

        Moderating a song:

        Each time you downloaded a song (by getting a CHK from a directory), 
        the node storing the song would, with a small probability, hand you 
        a moderation token (a random number).

        Your client would remember that you had been given a moderation token 
        for that file. Next time you connected to freenet (to give you time 
        to listen to the song), it would ask you to moderate the file: either 
        +1 for a good encoding, -1 for a bad one, or 0 for don't know / don't 
        care. You would also get a text box to enter a short comment on the 
        song.

        Your score, your comment and the file's CHK would be encrypted with 
        the file's RC4 key (you got that from the directory, remember?) and 
        sent back to the node which supplied the file (addressed using the 
        file's CHK, so you don't need a direct connection to the node), with 
        a plaintext version of the moderation token attached to allow the 
        node to check that it really asked for your opinion.

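        The text comment guards against known-plaintext attacks (without it,
        the plaintext would be little more than a score and a CHK the node
        already knows) and also gives the users a warm fuzzy feeling of
        community.  :)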

        The node which supplied the file doesn't know the RC4 key, so it 
        can't find out how you moderated the file and it can't change your 
        decision. This prevents malicious nodes from, for example, reversing 
        all moderation done to a file, or applying moderation decisions for 
        one file to a different file. The worst the node can do is discard 
        your decision, leaving the file unmoderated (in which case it won't 
        get requested often, and another node's copy of the song will be 
        downloaded instead).
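
The node's side of this could be sketched roughly as follows. The SongNode
class, the 5% default probability, the 16-byte token and the rule that a
token can be used only once are assumptions for the example; the post only
says tokens are random numbers handed out with a small probability.

```python
import random
import secrets


class SongNode:
    """Toy node that hands out moderation tokens and stores opaque votes."""

    def __init__(self, token_probability=0.05, rng=None):
        self.token_probability = token_probability   # assumed value
        self.rng = rng or random.Random()
        self.outstanding = {}    # chk -> set of tokens issued, not yet used
        self.moderations = {}    # chk -> list of encrypted blobs

    def serve(self, chk: bytes):
        """Called on each download; occasionally returns a moderation token."""
        if self.rng.random() < self.token_probability:
            token = secrets.token_bytes(16)
            self.outstanding.setdefault(chk, set()).add(token)
            return token
        return None

    def submit(self, chk: bytes, token: bytes, blob: bytes) -> bool:
        """Accept a blob only if this token was really issued for this CHK."""
        tokens = self.outstanding.get(chk, set())
        if token not in tokens:
            return False
        tokens.remove(token)     # assumed: one moderation per token
        self.moderations.setdefault(chk, []).append(blob)
        return True

    def results(self, chk: bytes) -> list:
        """Return the stored blobs; the node cannot read them itself."""
        return list(self.moderations.get(chk, []))
```

The node never sees the RC4 key, so blob is just bytes to it. All it can do
is keep the blob or throw it away.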

        Looking up a moderated song:

        When you look up a song's directory, you get the CHKs for a number 
        of encodings. Instead of requesting one of them straight away, you 
        send a message to the node holding one of the files, asking for the 
        moderation results for that file (this message is addressed using the 
        file's CHK). The node returns a stream of encrypted moderation 
        comments which it can't read. But you can, since you got the RC4 key 
        from the directory. You can verify that the comments apply to that
        file because they contain the file's CHK. Your client totals up the 
        file's score, shows you the users' comments, and asks you if you want 
        to download the file. If you don't want to, your client gets another 
        CHK from the directory and you repeat the process until you find a 
        good encoding (or decide from the comments that the song sucks, and 
        give up).
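
On the client side, sealing a moderation and later totalling up the results
might look like this Python sketch. The byte layout (one signed score byte,
then the comment, then a 20-byte CHK) and the names seal_moderation and
tally are assumptions; the post does not specify a format.

```python
import struct


def rc4(key: bytes, data: bytes) -> bytes:
    """RC4 stream cipher; the same call encrypts and decrypts."""
    s = list(range(256))
    j = 0
    for i in range(256):                       # key scheduling
        j = (j + s[i] + key[i % len(key)]) % 256
        s[i], s[j] = s[j], s[i]
    out = bytearray()
    i = j = 0
    for byte in data:                          # keystream generation
        i = (i + 1) % 256
        j = (j + s[i]) % 256
        s[i], s[j] = s[j], s[i]
        out.append(byte ^ s[(s[i] + s[j]) % 256])
    return bytes(out)


def seal_moderation(rc4_key: bytes, chk: bytes, score: int, comment: str) -> bytes:
    """Encrypt score + comment + CHK with the file's RC4 key."""
    body = struct.pack("b", score) + comment.encode("utf-8") + chk
    return rc4(rc4_key, body)


def tally(rc4_key: bytes, chk: bytes, blobs):
    """Decrypt each blob, keep those that name this CHK, total the scores."""
    total, comments = 0, []
    for blob in blobs:
        body = rc4(rc4_key, blob)
        if len(body) < 1 + len(chk) or body[-len(chk):] != chk:
            continue                           # not a moderation of this file
        (score,) = struct.unpack("b", body[:1])
        if score not in (-1, 0, 1):
            continue
        total += score
        comments.append(body[1:-len(chk)].decode("utf-8", errors="replace"))
    return total, comments
```

One caveat with the scheme as written: encrypting several messages under the
same RC4 key reuses the same keystream, so a real design would probably need
a per-message nonce mixed into the key.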


PROBLEM 4: WE'RE NOT IN FREENET ANY MORE

If this is supposed to be a quick hack to keep Napster fans happy, it won't
work. My design requires the following extensions to nodes:

        They must be able to route messages to the node storing a given CHK. 
        This possibly opens up the network to DoS attacks (?). This is 
        required for moderation.

        They must be able to route messages to the node storing a given KHK as 
        well, to allow entries to be added to directories. Again, DoS.

        They must understand the message "add this read key to the directory 
        with this KHK".

        They should also perform some directory management tasks during idle 
        moments:

                * Check that a new entry really exists by requesting the file 
                  it points to.

                * Retrieve two files, decrypt them and compare them. If they 
                  are the same, remove one of the directory entries.
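
The two housekeeping tasks could be combined into one pass, roughly as in
the Python sketch below. It assumes directory entries are read keys made of
a 20-byte SHA-1 CHK followed by the RC4 key, and it compares digests of the
decrypted files rather than comparing files pairwise; tidy_directory and
fetch are made-up names.

```python
import hashlib


def rc4(key: bytes, data: bytes) -> bytes:
    """RC4 stream cipher; the same call encrypts and decrypts."""
    s = list(range(256))
    j = 0
    for i in range(256):                       # key scheduling
        j = (j + s[i] + key[i % len(key)]) % 256
        s[i], s[j] = s[j], s[i]
    out = bytearray()
    i = j = 0
    for byte in data:                          # keystream generation
        i = (i + 1) % 256
        j = (j + s[i]) % 256
        s[i], s[j] = s[j], s[i]
        out.append(byte ^ s[(s[i] + s[j]) % 256])
    return bytes(out)


def tidy_directory(entries, fetch):
    """Drop entries whose file is missing or duplicates an earlier entry.

    entries: list of read keys (CHK + RC4 key).
    fetch:   callable taking a CHK, returning the ciphertext or None.
    """
    kept, seen = [], set()
    for read_key in entries:
        chk, rc4_key = read_key[:20], read_key[20:]
        ciphertext = fetch(chk)
        if ciphertext is None:                 # entry points at nothing
            continue
        digest = hashlib.sha1(rc4(rc4_key, ciphertext)).digest()
        if digest in seen:                     # same song bytes seen already
            continue
        seen.add(digest)
        kept.append(read_key)
    return kept
```

Note that this only works if the node holding the directory can see the RC4
keys, which is the case if the directory stores read keys.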

-----------------------------------------------------------------------------

Any thoughts?


Michael

_______________________________________________
Freenet-dev mailing list
Freenet-dev at lists.sourceforge.net
http://lists.sourceforge.net/mailman/listinfo/freenet-dev
