On Saturday 06 June 2009 20:11:58 Matthew Toseland wrote:
> The current index works, but it is seriously broken in terms of first-time
> usability. Specifically, because of an XMLSpider bug that has since been
> fixed (unfortunately, wAnnA hasn't been updated since), it has a completely
> insane structure:
> - 15x indexes 1-f of size tens of megabytes. If the word is in one of these
> (which it is 93% of the time), the librarian must download 200+ blocks before
> displaying results. The upside is these are very popular, but even on a new
> node with lots of connections it takes considerable time to fetch them.
> - Approx 256x indexes 0[0-9a-f][0-9a-f], which are tiny. Most of these are
> also managing to persist, but not all of them.
>
> Because the spider takes weeks to fetch everything from scratch, it is
> unlikely that anyone will be able to insert a new index before we need to
> release. The obvious option is to take the existing data and reorganise it
> into smaller chunks, but if I were to do that I would be inserting an index,
> and I don't think that's a good idea, because 1) it exposes us to additional
> legal risk, doesn't it? and 2) I definitely don't want to run the spider in
> the long term, unless it's vital, and that would definitely be legally risky;
> and publishing it once would mean we had to move XMLLibrarian to my SSK...
>
> Solutions? If any anonymous person happens to have a big librarian index, or
> is able to do the reorganisation I mentioned (you need to split the words by
> md5, and you need to put in the site data for only the words that are in that
> subindex), inserting a new librarian index and announcing it anonymously
> would be *really* helpful right now.
>
For now, the solution is to have the search form open the search in a new
window (magically converted into a new tab by Firefox, and hopefully other
browsers), and to warn the user about this. This isn't a great solution, but it
is better than nothing.
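
For anyone attempting the reorganisation quoted above, here is a rough sketch
of the idea: bucket each word by the leading hex digits of its MD5, and copy
into each bucket only the site data that its words refer to. This is
illustrative only. The in-memory layout (a dict of word -> site ids plus a
dict of site id -> site data), the function names, and the prefix_len
parameter are all assumptions made for the example, not the real XMLSpider
or XMLLibrarian formats.

```python
import hashlib
from collections import defaultdict


def md5_prefix(word, prefix_len):
    """Return the first prefix_len hex digits of the word's MD5 digest."""
    return hashlib.md5(word.encode("utf-8")).hexdigest()[:prefix_len]


def split_index(word_to_sites, site_data, prefix_len=2):
    """Split one large index into subindexes keyed by MD5 prefix.

    Hypothetical layout: word_to_sites maps word -> list of site ids,
    site_data maps site id -> whatever is stored per site.
    Each subindex gets only its own words, plus site data for only the
    sites those words refer to.
    """
    subindexes = defaultdict(lambda: {"words": {}, "sites": {}})
    for word, site_ids in word_to_sites.items():
        sub = subindexes[md5_prefix(word, prefix_len)]
        sub["words"][word] = list(site_ids)
        for site_id in site_ids:
            sub["sites"][site_id] = site_data[site_id]
    return dict(subindexes)


if __name__ == "__main__":
    # Toy data; the site entries are placeholders, not real keys.
    words = {"freenet": [1, 2], "spider": [2], "index": [3]}
    sites = {1: "site A", 2: "site B", 3: "site C"}
    for prefix, sub in sorted(split_index(words, sites).items()):
        print(prefix, sorted(sub["words"]), sorted(sub["sites"]))
```

A longer prefix gives more, smaller subindexes, so a lookup fetches less
data, at the cost of having more files to insert and keep alive.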