On Thursday 18 Jul 2013 10:54:49 leuchtkaefer wrote:
> 
> I include some documentation on what I am working:
> 
> Notes on changing Library API /WoT
> 
> Requirements:
> =============
> whatever you do to refactor Library will still need to support Spider
> 
> Current main problem: 
> =====================
> Library can only handle one tree at a time (and it generates the USK for 
> that itself). 
> 
> We need:
> =======
> -Generalise both existing messages in the Library API
> -Modify the frequency of property updates in WoT (see things done by WoT)
> -An FCP command to fetch the index so we can show it on the Spider UI, NOT 
> SURE, check with toad.
> -Mid to long term: search the USKs of other indexes (from other identities)

Right, one thing at a time.
> 
> Things done by Curator 
> ======================
> Curator gets all identities of the local node from WoT plugin. 
> It lets the client select the identity that will share indexes. Each 
> identity only has one index (do we need more?). Each identity may have a list 
> of topics to curate (terms). 
> Curator derives the index's key from the identity: the index key uses the 
> identity's Request URI as a prefix, changing only the filename part (e.g. 
> "index").
> It passes new entries to Library to populate the indexes. Each entry 
> corresponds to an on-freenet webpage or document. Eventually it may read 
> info automatically from the webpage to share (page description, USK, etc.). 
> In the case of documents, the user needs to complete the form manually. 
> Entries are collected in a buffer; then we send a pushBuffer message to 
> Library. 
> It provides Library with the USK for the index.
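On the key-derivation point above, roughly what I have in mind (a minimal sketch; the helper name, the exact USK shape, and the docname "index" are all assumptions, not the final API):

```java
// Hypothetical sketch: derive an index URI from an identity's Request URI
// by keeping the key prefix and swapping the document-name part for "index".
public class IndexKeyDerivation {

    static String deriveIndexUri(String identityRequestUri) {
        // A WoT Request URI looks roughly like:
        //   USK@<routingKey>,<cryptoKey>,<extra>/WebOfTrust/<edition>
        int firstSlash = identityRequestUri.indexOf('/');
        if (firstSlash < 0)
            throw new IllegalArgumentException("Not a USK-style URI: " + identityRequestUri);
        String keyPart = identityRequestUri.substring(0, firstSlash);
        // Reuse the key material, but publish under a separate docname, edition 0.
        return keyPart + "/index/0";
    }

    public static void main(String[] args) {
        String identity = "USK@AAAA,BBBB,AQACAAE/WebOfTrust/42";
        System.out.println(deriveIndexUri(identity));
        // USK@AAAA,BBBB,AQACAAE/index/0
    }
}
```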

Right. So:
1. Generalise Library FCP API to support multiple indexes. Make Spider work 
with the new API.
2. Make pushBuffer able to force an upload to Freenet.
3. Add support for a new kind of entry to Library: TermFileEntry (compare to 
e.g. TermPageEntry). This represents a file for filesharing purposes. It 
contains the URI, MIME type, possibly a title, hashes, etc. We may want to sign 
it. Create some sort of basic initial UI to add files to the index, via the 
Library API.

We will want to search by keyword ("term" in Library). So there is a 
fundamental decision to make here: do we want to duplicate the TermFileEntry 
(which could be fairly large, maybe 200 bytes?) under each term/keyword? The 
simplest answer is yes, although there is a cost in the amount of data we have 
to upload... the complex answer is no: create a second tree. IMHO the right 
answer for now is probably to have a single tree.

4. Search support for specified USKs (possibly in Library?)
5. Search support for all USKs visible in the local WoT (where?)
6. Optimisations, e.g. pre-fetch the top parts of the tree, post bloom filters 
of terms in the index etc.
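To make point 3 concrete, here is a sketch of what a TermFileEntry might carry, with a helper to sanity-check the "maybe 200 bytes?" duplication estimate. All the field and method names here are made up by analogy with TermPageEntry; the real class would have to fit Library's existing entry hierarchy:

```java
import java.nio.charset.StandardCharsets;

// Sketch of a per-file index entry; this payload is what would be
// duplicated under each term/keyword in the single-tree design.
public class TermFileEntry {
    final String subject;   // the term/keyword this entry is filed under
    final String uri;       // CHK or USK of the shared file
    final String mimeType;
    final String title;     // optional, may be null
    final byte[] sha256;    // content hash, 32 bytes

    TermFileEntry(String subject, String uri, String mimeType,
                  String title, byte[] sha256) {
        this.subject = subject;
        this.uri = uri;
        this.mimeType = mimeType;
        this.title = title;
        this.sha256 = sha256;
    }

    // Rough serialized size of the duplicated payload (excludes the
    // subject, which is the tree key rather than part of the entry).
    int approximateSize() {
        return uri.getBytes(StandardCharsets.UTF_8).length
             + mimeType.getBytes(StandardCharsets.UTF_8).length
             + (title == null ? 0 : title.getBytes(StandardCharsets.UTF_8).length)
             + sha256.length;
    }
}
```

With a real ~100-character CHK, a MIME type, a short title and a hash, this lands in the same ballpark as the 200-byte guess above.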
> 
> Things done by Library
> ======================
> Maintains the different indexes on disk. Basically mergeToDisk() parses the 
> data (which is a SimpleFieldSet) and adds it to the on-disk tree. 
> Currently it only handles one index, for Spider. Library generates the USK 
> (randomly) and uploads the index to Freenet.
> At the moment, Library API can handle two messages designed for Spider:
> 1) getSpiderURI
> getSpiderURI calls getPublicUSKURI(): handle -> handleGetSpiderURI() -> 
> getPublicUSKURI()
> getPublicUSKURI() is a method on SpiderIndexUploader. It is useful for 
> finding out where the tree is uploaded on Freenet (currently a random USK).
> Of course the tree itself is uploaded to a CHK - uploading a pointer to a USK 
> is something on top
> 2)pushBuffer
> The pushBuffer FCP command takes a Bucket of data, which is a series of pages 
> ... eventually mergeToDisk() gets called (the outer layers are performance 
> hacks/buffering)

Yep. And eventually when the on-disk index gets big enough we call through to 
mergeToFreenet.

IMHO for Curator we will want a call to explicitly start an upload to Freenet 
regardless of the size of the on-disk index.
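The decision logic I am thinking of is something like the following (the threshold value, the method and the flag are illustrative only, not Library's actual code):

```java
// Sketch of the "force upload" idea: mergeToFreenet normally runs only when
// the on-disk index passes a size threshold; a Curator-triggered FCP call
// would bypass that check.
public class UploadPolicy {
    static final long MERGE_THRESHOLD_BYTES = 4L * 1024 * 1024; // assumed 4 MiB

    /** Decide whether mergeToFreenet should run now. */
    static boolean shouldUpload(long onDiskIndexBytes, boolean forcedByCurator) {
        return forcedByCurator || onDiskIndexBytes >= MERGE_THRESHOLD_BYTES;
    }
}
```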
> 
> Things done by WoT (mail p0s)
> =============================
> We need to update the USK edition number periodically. We need the update 
> because fetching the latest version of a USK is not very efficient; it may 
> take some time to find the newest edition. 
> Updating an identity's property causes an identity re-insert (i.e. uploading 
> the identity to Freenet). We need WoT to update the IndexRoot property 
> frequently, but not every time we update the indexes. We don't want to cause 
> too many re-inserts. So the optimisation is that our property gets updated 
> only once, in the daily identity re-insertion triggered by WoT. p0s can 
> change this behaviour.

Right.
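The coalescing described above amounts to something like this (names and structure are assumptions on my part, not WoT's actual code):

```java
// Sketch: IndexRoot edition bumps are recorded immediately, but only
// published at the next daily identity re-insert, so frequent index
// uploads don't each trigger a re-insert.
public class IndexRootProperty {
    private long latestEdition = -1;    // updated on every index upload
    private long publishedEdition = -1; // what the last identity insert carried

    void onIndexUploaded(long edition) {
        latestEdition = Math.max(latestEdition, edition);
    }

    /** Called from the daily re-insert; returns true if the property changed. */
    boolean publishAtDailyReinsert() {
        if (latestEdition == publishedEdition) return false;
        publishedEdition = latestEdition;
        return true;
    }
}
```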
> 
> Leuchtkaefer

_______________________________________________
Devl mailing list
[email protected]
https://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl