>
>              FROM: finney.org
>              DATE: 04/19/2000 08:37:42
>              SUBJECT: RE: [Freenet-dev] Proposal for the Near Future
>              (Searching, CHKs and encryption, ..., oh my!)
>
>              Michael Wiktowy <spam at mindless.com> wrote:
>              > the general request (aka search):
>
>              We are using the term "searching" somewhat ambiguously.  As you
>              use it here you are basically referring to key guessing, and
>              routing using the normal Freenet routing algorithm.
>
>              I think "searching" has more generally referred to some mechanism
>              to let you find data when you can't guess the key.  You could try
>              to make some kind of broadcast work, as Brandon and others have
>              proposed.  Ian suggests doing a fuzzy match on the keys.  Others
>              have proposed putting the data in (as index/metadata entries)
>              multiple times under many different relevant keywords so that the
>              guess-and-route algorithm has a better chance of finding the data.
>
>              To be clear, your proposal doesn't really try to solve the
>              searching problem, but it could seemingly be used in conjunction
>              with these other possible approaches.

You are absolutely right here ... my proposal does not include a specific
search method other than just returning the metadata associated with a
particular KHK ... which may match many entries. I am sort of tying it in
with my thoughts that people can insert references to their data under
keyword KHKs, but it is certainly not limited to that. I am somewhat
supportive of the concept of fuzzy KHK matching and have an idea that I am
mulling over before regurgitating it here. This infrastructure should work
nicely with any KHK matching algorithm, though ... and that is the level at
which I think you are defining searches ... rather than the return mechanism
for the multiple matches that I am outlining.

You can narrow the definition of a search down as much as you like, but I had
just used the term to differentiate it from a targeted request:

search = get a group of metadata from which a targeted request can be made
targeted request = retrieve the one, and only one, match (metadata and data)
for your selection
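
A minimal sketch of the distinction in Python. The `Entry` type, the function
names and the list-based store are all hypothetical, and keys are kept as plain
strings for readability; nothing here is taken from the existing Freenet code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    khk: str        # keyword-hash key (plain string here, an assumption)
    chk: str        # content-hash key
    metadata: str
    data: bytes


def search(store, khk):
    """Search: return (CHK, metadata) for every entry under khk, no data."""
    return [(e.chk, e.metadata) for e in store if e.khk == khk]


def targeted_request(store, khk, chk):
    """Targeted request: return the one entry matching both keys, or None."""
    matches = [e for e in store if e.khk == khk and e.chk == chk]
    if len(matches) > 1:
        raise ValueError("store holds duplicate (KHK, CHK) pairs")
    return matches[0] if matches else None
```

A search hands back only enough metadata to pick a CHK; the targeted request
then uses that CHK together with the KHK to fetch exactly one entry.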

>             FROM: finney.org
>              DATE: 04/19/2000 08:22:13
>
>              Mike writes:
>              > The CPU load of all these CHK generations is the only thing that
>              > may be a problem if the CH algorithm is too complex. However, it
>              > doesn't need to be overly complex since more uniqueness will be
>              > provided by hashing it with the KHK. Also, generating the hash
>              > once on insert and then having sentry nodes verify the hash
>              > should ease some of the load.
>
>              One thing I didn't understand in your description was whether you
>              need to index under Hash(CHK:KHK).  Why not just store CHK and KHK
>              with each entry in the node's data store?  Then when you get a KHK
>              it is quick and easy to go down each entry and check for a match;
>              you don't have to compute a hash for each entry.
>

You are correct that you could store just the KHK and the CHK in the index.
I am proposing storing H(KHK:CHK) and the CHK simply because you are only
allowed to have one H(KHK:CHK) in your store (that is the collision check),
while you can have many matching KHKs. If you are going to be checking for
collisions at this level, it seems faster to have that as a statically stored
value rather than a dynamically calculated one.
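
A rough sketch of that index in Python, under stated assumptions: SHA-1 stands
in for both H and the content hash, ":" is used as the separator, and the
`Store` class and its methods are hypothetical names, not existing Freenet code.

```python
import hashlib


def index_key(khk: bytes, chk: bytes) -> bytes:
    """H(KHK:CHK); SHA-1 and the ':' separator are assumptions."""
    return hashlib.sha1(khk + b":" + chk).digest()


class Store:
    def __init__(self):
        # Maps the stored H(KHK:CHK) value to (CHK, payload).
        self._entries = {}

    def insert(self, khk: bytes, payload: bytes) -> bool:
        """Insert payload under khk; return False on an index collision."""
        chk = hashlib.sha1(payload).digest()
        key = index_key(khk, chk)
        if key in self._entries:
            return False          # collision check is a single lookup
        self._entries[key] = (chk, payload)
        return True

    def search(self, khk: bytes):
        """Return the CHK of every entry whose stored key matches khk."""
        return [
            chk
            for key, (chk, _) in self._entries.items()
            if index_key(khk, chk) == key
        ]
```

With the hash stored, the collision check on insert is one lookup, and the same
payload can still be filed under several KHKs. The cost moves to `search`, which
in this sketch recomputes H(KHK:CHK) once per stored entry.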

Also, the CHK is not a unique value to index with either, since you would
like the Freenet system to be capable of storing multiple copies of something
under multiple KHKs. There is also no guarantee that two different data files
larger than the CHK won't have the same CHK. It may be unlikely, but I believe
that H(KHK:CHK) would give another level of uniqueness.

I think our current storage and key matching scheme depends on a unique index
in the store, and I was just trying to fit into the existing scheme. This
allows a lot of reuse of the current code, with just some new methods being
introduced to handle multiple KHK matches.

I suppose you need not hash the KHK:CHK combination at all but simply
concatenate them. However, I figured that obscuring the KHK in the store would
make it harder to tamper with (and the CHK is tamper-resistant already) ...
if that is a worry.

If people start to store things under keywords, then it would be easy to find
the hashes of some illicit keywords (mp3, for example) and force people to
delete references in their stores under the KHK of "mp3". The metadata header
being part of the CHK, and the CHK being part of the H(KHK:CHK), makes this
kind of censorship pretty hard to perform.

One thing that would need to be sorted out, though, is encrypting the metadata
as well. It could be encrypted along with the regular data (or you could
encrypt the regular data, tack on the metadata and encrypt a second time), but
the node would need to know which encryption method to use (store-side
metadata rather than data-side metadata).

Mike



_______________________________________________
Freenet-dev mailing list
Freenet-dev at lists.sourceforge.net
http://lists.sourceforge.net/mailman/listinfo/freenet-dev