On Sun, Jun 04, 2006 at 07:01:09PM +0200, Jerome Flesch wrote:
> On Saturday 3 June 2006 03:16, Matthew Toseland wrote:
> > On Sat, Jun 03, 2006 at 03:00:49AM +0200, Jerome Flesch wrote:
> > > > > > The main changes I would make to the librarian
> > > > > > format right now would be:
> > > > > > - Support splitting. (This is relevant to file indexes)
> > >
> > > I updated my format proposal on
> > > http://wiki.freenetproject.org/AnotherFreenetIndexFormat to try to fit
> > > your requirements, but I still need some explanations on this point:
> > > I don't really understand why indexes need to handle file splitting:
> > > the FCPv2 specs say that the node does most of this work, no?
> >
> > Splitting of the index itself. Because we will want to fetch only the
> > relevant pieces if it gets big. If we have a lot of freesites, we will
> > need to split the index up - perhaps by the first letter or two - in
> > order to avoid having to fetch very large files regularly. Users are
> > used to having to wait for search results with p2p, so this isn't
> > necessarily a big problem. The search engine would fetch only those
> > index parts needed for the particular search. Some letters would likely
> > have fewer words under them, in which case they could be aggregated.
> 
> I added a sub-index mechanism, assuming splitting is done on the first
> letters of words.

That is what we want, yes. We might have the number of letters be
variable even in a given index, since some prefixes only have a few
words in them...
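
A rough sketch of what variable-length prefixes could look like, in Python
for brevity. This is only an illustration, not the format on the wiki page:
the names split_index, sub_index_for, max_words and max_prefix_len are all
made up here. A prefix is only lengthened when the sub-index under it would
hold too many words. It does not cover merging several sparse letters into
one file.

```python
from collections import defaultdict


def split_index(words, max_words=1000, max_prefix_len=3):
    """Map each prefix to the sorted words stored in that sub-index.

    Hypothetical helper: max_words and max_prefix_len are assumed tuning
    knobs, not part of any existing index format.
    """
    result = {}

    def split(prefix, group):
        # Small enough, or the prefix may not grow further: one sub-index.
        if len(group) <= max_words or len(prefix) >= max_prefix_len:
            result[prefix] = sorted(group)
            return
        buckets = defaultdict(list)
        for word in group:
            buckets[word[:len(prefix) + 1]].append(word)
        for child_prefix, child in buckets.items():
            if child_prefix == prefix:
                # The word is exactly the prefix, so it cannot be split more.
                result[prefix] = sorted(child)
            else:
                split(child_prefix, child)

    split("", {w.lower() for w in words})
    return result


def sub_index_for(word, index):
    """Return the longest prefix of word that names a sub-index, or None."""
    word = word.lower()
    for n in range(len(word), -1, -1):
        if word[:n] in index:
            return word[:n]
    return None
```

With something like this, a search for one word only needs the single
sub-index returned by sub_index_for, which is the point of splitting.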
> 
> > > > > > - Maybe include some amount of metadata - functional (mime type),
> > > > > > or theoretical (category, dublin core...), or other (activelinks?).
> > > > > > (This is definitely relevant to file indexes).
> > >
> > > Regarding "activelinks", what do you mean exactly ?
> >
> > 95x32 icons for freesites.
> >
> I added an option for that, but I'm not sure that was exactly what you meant.

They were very popular on 0.4/0.5.
> 
> > > > > > - Include the filename in the index. Possibly using negative word
> > > > > >   indexes to indicate "in the filename" words; it must be possible
> > > > > > to distinguish between matches in the page title and matches in the
> > > > > > content. (This is also relevant to both web page indexes and file
> > > > > > indexes, though especially to the latter).
> > >
> > > By filename, did you mean document titles ?
> >
> > No, I mean the filename - the URI. Which is what you will mostly be
> > searching on for searches for non-text files.
> >
> Hm, wouldn't it be more relevant to make an exception and use titles, at
> least for HTML documents?

Maybe both? For HTML the title is far more relevant...
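
To make the "negative word indexes" idea concrete, here is a small sketch in
Python. It is one possible reading only, and every name in it
(index_document, matched_in_filename, the example URI) is hypothetical.
Words found in the URI get positions -1, -2, ... and words in the content
get 0, 1, 2, ..., so a search can tell the two kinds of match apart. It
treats the whole URI as the filename and does not handle titles.

```python
import re
from collections import defaultdict

# Assumption: a "word" is a run of ASCII letters or digits.
WORD = re.compile(r"[a-z0-9]+")


def index_document(uri, content):
    """Return {word: [positions]}; negative positions mean "in the URI"."""
    positions = defaultdict(list)
    # URI words are numbered -1, -2, ... in order of appearance.
    for i, word in enumerate(WORD.findall(uri.lower()), start=1):
        positions[word].append(-i)
    # Content words are numbered 0, 1, 2, ...
    for i, word in enumerate(WORD.findall(content.lower())):
        positions[word].append(i)
    return dict(positions)


def matched_in_filename(positions):
    """True if at least one occurrence of the word was in the URI."""
    return any(p < 0 for p in positions)


# Made-up URI, for illustration only.
entry = index_document("CHK@abc/holiday-photo.jpg", "A photo from the holiday")
print(entry["photo"])                        # positions in URI and content
print(matched_in_filename(entry["from"]))    # content-only word
```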
-- 
Matthew J Toseland - toad at amphibian.dyndns.org
Freenet Project Official Codemonkey - http://freenetproject.org/
ICTHUS - Nothing is impossible. Our Boss says so.
