A progress report on my project.

My SoC project was originally about improving XMLLibrarian and
XMLSpider to give Freenet users a better search experience, in
particular addressing the problem of newcomers to Freenet starting a
search and seeing nothing happen for ages. Some of my original targets
have changed, though: infinity0's work on new index formats means I no
longer have to change the index format myself to include more
metadata.

Some of my original targets and status:

Asynchronous searching - the current official version of XMLLibrarian,
based on the work I did at Easter, runs searches separately from
showing progress so that the page is not blank. In my forked version,
searches for different terms and on different indexes run at the same
time to speed this up further.
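
To illustrate the idea of running the per-term, per-index searches at
the same time, here is a minimal sketch. It is not the plugin's actual
code: the class name, the fetch() stand-in and the fixed pool size of 4
are all assumptions made for the example, and a real fetch would go
through the node rather than return a string.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class ParallelSearchSketch {

    // Hypothetical stand-in for fetching the entry for one term from one index.
    static String fetch(String index, String term) {
        return term + "@" + index;
    }

    // Starts one fetch per (index, term) pair, then gathers the results
    // in the order the fetches were submitted.
    static List<String> searchAll(List<String> indexes, List<String> terms) {
        ExecutorService pool = Executors.newFixedThreadPool(4); // assumed pool size
        try {
            List<Future<String>> pending = new ArrayList<>();
            for (String index : indexes) {
                for (String term : terms) {
                    Callable<String> task = () -> fetch(index, term);
                    pending.add(pool.submit(task));
                }
            }
            List<String> results = new ArrayList<>();
            for (Future<String> future : pending) {
                results.add(future.get()); // waits for this one fetch only
            }
            return results;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException("search failed", e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(
                searchAll(List.of("indexA", "indexB"), List.of("freenet", "search")));
    }
}
```

All fetches are submitted before any result is waited on, so a slow
index does not hold up the start of the others.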

Search progress - the current version shows the raw description of the
fetch progress from the ClientGetters fetching the indexes. My newer
version uses AJAX to update the progress and get the results, avoiding
page refreshes if you have JavaScript enabled in your browser.
Progress bars are also shown for each fetch. I did have partial
results being shown as individual fetches completed, but it hurt
performance and I doubted whether it would be of use to anyone.

Result listing - last week's commits improved the display of search
results, including grouping SSK sites and hiding older versions of
them, and showing URIs and USK links (I think people were asking for
that).
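
As a rough sketch of what grouping SSK sites and hiding older versions
can look like: the code below is not taken from the plugin. It assumes
result URIs of the shape SSK@<key>/<sitename>-<edition>/<path>, and the
class and method names are made up for the example. URIs that do not
fit that shape are left in a group of their own.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class SskGroupingSketch {

    // Assumed URI shape: SSK@<key>/<sitename>-<edition>[/<path>]
    private static final Pattern SSK_EDITION =
            Pattern.compile("^(SSK@[^/]+/[^/]+)-(\\d+)(/.*)?$");

    // Groups URIs by site (key plus site name) and keeps only the pages
    // belonging to the highest edition seen for each site.
    static Map<String, List<String>> groupLatest(List<String> uris) {
        Map<String, List<String>> groups = new LinkedHashMap<>();
        Map<String, Long> newest = new HashMap<>();
        for (String uri : uris) {
            Matcher m = SSK_EDITION.matcher(uri);
            if (!m.matches()) {
                // Not an edition-numbered SSK: keep it as its own group.
                groups.putIfAbsent(uri, new ArrayList<>(List.of(uri)));
                continue;
            }
            String site = m.group(1);
            long edition = Long.parseLong(m.group(2));
            Long seen = newest.get(site);
            if (seen == null || edition > seen) {
                // Newer edition: drop whatever was kept for older ones.
                newest.put(site, edition);
                groups.put(site, new ArrayList<>(List.of(uri)));
            } else if (edition == seen) {
                groups.get(site).add(uri);
            }
            // Older editions are skipped, which is what hides them.
        }
        return groups;
    }

    public static void main(String[] args) {
        System.out.println(groupLatest(List.of(
                "SSK@abc/blog-1/index.html",
                "SSK@abc/blog-3/index.html",
                "SSK@abc/blog-3/about.html")));
    }
}
```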

Search queries - this is my most recent work. The NOT and OR
operations are working well (as far as my tests have gone). I have
been implementing phrase searching, but it is not working yet for a
reason I have yet to determine.
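
For readers unfamiliar with how NOT and OR combine results, one common
way to think of them is as set operations on the result sets of the
individual terms. The sketch below shows only that idea, assuming each
term's results are a set of URI strings; the class and method names
are hypothetical and this is not necessarily how the plugin does it.

```java
import java.util.LinkedHashSet;
import java.util.Set;

class QueryOpsSketch {

    // "a OR b": every result that matches either term (set union).
    static Set<String> or(Set<String> left, Set<String> right) {
        Set<String> out = new LinkedHashSet<>(left);
        out.addAll(right);
        return out;
    }

    // "a NOT b": results for a with anything matching b removed (set difference).
    static Set<String> not(Set<String> matches, Set<String> excluded) {
        Set<String> out = new LinkedHashSet<>(matches);
        out.removeAll(excluded);
        return out;
    }

    public static void main(String[] args) {
        Set<String> cats = Set.of("page1", "page2");
        Set<String> dogs = Set.of("page2", "page3");
        System.out.println(or(cats, dogs).size());
        System.out.println(not(cats, dogs));
    }
}
```

Both methods copy their first argument, so the per-term result sets
can be reused for other parts of the same query.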

This work is available in my fork:
git://github.com/platy/plugin-XMLLibrarian-staging.git
It is ready to be merged back into the freenet staging repository, and
hopefully into the official one as well after review.

Other items in my proposal that still need to be done are:

Recording metadata in the spider - this will allow more information on
the search page and (some) search relevance ranking.

Use of filters other than the HTML one - to allow other file types to
be indexed.

Embedded search in freesites - allowing someone who has uploaded an
index to present a box on their site to start searches on it.

Importantly, I will also be working with infinity0 to integrate the
Interdex distributed index system into the search interface and
crawling.

MikeB
