>
> Right. So:
> 1. Generalise Library FCP API to support multiple indexes. Make Spider work
> with
> the new API.
How can I test Spider? Will it consume all my laptop's resources? Anyway, I am
not touching Spider code, so it should work independently of Curator.
> 2. Make pushBuffer able to force an upload to Freenet.
This is related to my previous email asking for more documentation. I am not
sure whether the way index entries are processed for Spider is the best way for
Curator. I need documentation on the SpiderIndexUploader class.
> 3. Add support for a new kind of entry to Library: TermFileEntry (compare to
> e.g. TermPageEntry). This represents a file for filesharing purposes. It
> contains the URI, MIME type, possibly a title, hashes, etc. We may want to
> sign
> it. Create some sort of basic initial UI to add files to the index, via the
> Library API.
>
Will TermFileEntry and TermPageEntry be in the same index? What do you mean by
signing it?
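For concreteness, here is a minimal sketch of what such a TermFileEntry could look like, by analogy with TermPageEntry. All field and class names here are my assumptions, not the actual Library API:

```java
// Hypothetical sketch of a TermFileEntry, by analogy with Library's
// TermPageEntry. Field names are assumptions, not the real Library API.
public class TermFileEntry {
    public final String subject;   // the term/keyword this entry is indexed under
    public final String uri;       // Freenet URI of the shared file (e.g. a CHK)
    public final String mimeType;  // MIME type, for filesharing clients
    public final String title;     // optional human-readable title; may be null
    public final byte[] sha256;    // content hash; a signature could cover this

    public TermFileEntry(String subject, String uri, String mimeType,
                         String title, byte[] sha256) {
        if (subject == null || uri == null)
            throw new IllegalArgumentException("subject and uri are required");
        this.subject = subject;
        this.uri = uri;
        this.mimeType = mimeType;
        this.title = title;
        this.sha256 = sha256;
    }
}
```

At ~200 bytes serialized (URI plus hash plus metadata), this matches the size estimate quoted above.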
> We will want to search by keyword ("term" in Library). So there is a
> fundamental decision to make here: Do we want to duplicate the TermFileEntry
> (which could be fairly large, maybe 200 bytes?), under each term/keyword? The
> simplest answer is yes, although there are some costs in the amount of data
> we
> have to upload... the complex answer is no, create a second tree. IMHO the
> right
> answer for now is probably to have a single tree.
Well... I implemented the simple solution, as we previously discussed. The
result is negative: it takes ages to upload all the data. Maybe adding a second
tree would be a better option, but I cannot see how to implement it yet. If at
least I could understand the above-mentioned class better, I might rearrange
things to improve performance.
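To make the trade-off concrete, here is a back-of-envelope estimator for the two options discussed above (duplicate the full entry under every term, versus a second tree holding each entry once with small references under the terms). The class name and all the numbers are illustrative assumptions, not measured Library figures:

```java
// Rough cost comparison of the two index layouts discussed above.
// All sizes are illustrative assumptions, not measured figures.
public class IndexCostEstimator {
    /** Bytes uploaded if each entry is duplicated in full under every term. */
    public static long duplicatedBytes(long entries, long termsPerEntry,
                                       long entryBytes) {
        return entries * termsPerEntry * entryBytes;
    }

    /** Bytes uploaded if entries live once in a second tree and each term
     *  stores only a small reference (refBytes) to the entry. */
    public static long secondTreeBytes(long entries, long termsPerEntry,
                                       long entryBytes, long refBytes) {
        return entries * entryBytes + entries * termsPerEntry * refBytes;
    }
}
```

For 1000 entries of ~200 bytes with 7 terms each, duplication uploads 1.4 MB while a second tree with 16-byte references uploads about 312 KB, so the second tree wins roughly in proportion to the number of terms per entry.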
> 4. Search support for specified USKs (possibly in Library?)
I don't understand.
> 5. Search support for all USKs visible in the local WoT (where?)
Again, what do you mean by search support for a specified USK? If I know the
USK, I just type it. I am not sure what you mean.
> 6. Optimisations, e.g. pre-fetch the top parts of the tree, post bloom
> filters
> of terms in the index etc.
I imagine this is for the future; the priorities are performance of index
upload and content search.
>
> Yep. And eventually when the on-disk index gets big enough we call through to
> mergeToFreenet.
Do you want to call mergeToFreenet only when the on-disk index is big, so that
entries are not uploaded immediately? Maybe that is the current problem: each
entry is merged into the on-disk tree and then uploaded. It takes 2 hours to
upload just one new entry that has about 7 TermPageEntries.
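The deferred behaviour you describe could be sketched like this: buffer merges on disk and only push to Freenet once the accumulated size passes a threshold, with a forced flush for point 2 (pushBuffer forcing an upload). The class name, method names, and threshold here are hypothetical stand-ins for the real Library/Spider code, and the real flush would call mergeToFreenet:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: merge entries into the on-disk tree and upload only
// when the buffered size passes a threshold. flush() stands in for the real
// mergeToFreenet call; the threshold value is an assumption.
public class BufferedIndexUploader {
    private final long thresholdBytes;
    private long bufferedBytes = 0;
    private final List<String> pendingEntries = new ArrayList<>();
    private int uploads = 0;

    public BufferedIndexUploader(long thresholdBytes) {
        this.thresholdBytes = thresholdBytes;
    }

    /** Merge an entry on disk; upload only once the buffer is big enough. */
    public void mergeEntry(String entry, long sizeBytes) {
        pendingEntries.add(entry);
        bufferedBytes += sizeBytes;
        if (bufferedBytes >= thresholdBytes) flush();
    }

    /** Force an upload regardless of size (cf. pushBuffer in point 2). */
    public void flush() {
        if (pendingEntries.isEmpty()) return;
        // ... the real code would call mergeToFreenet() here ...
        uploads++;
        pendingEntries.clear();
        bufferedBytes = 0;
    }

    public int uploadCount() { return uploads; }
}
```

With a layout like this, an entry with 7 TermPageEntries would only cost one upload per threshold-sized batch instead of one upload per merge.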
_______________________________________________
Devl mailing list
[email protected]
https://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl