So the idea is to produce index pages and index sites and anything else
required? IMHO if it's to be used for search indexes (e.g. using the
Librarian plugin), it should support keyword indexing of the content of
freesites.
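
For illustration, keyword indexing could be as simple as an inverted
index mapping each word to the freesites that contain it. The sketch
below is only that, a sketch: the class name KeywordIndex, its methods
and the example URIs are hypothetical, and none of it is taken from the
Librarian plugin or from existing Freenet code.

```java
import java.util.Collections;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

/** Hypothetical inverted index: keyword -> URIs of pages containing it. */
public class KeywordIndex {
    private final Map<String, Set<String>> postings = new TreeMap<>();

    /** Splits the page text into lower-case words and records the URI under each. */
    public void addPage(String uri, String text) {
        // Split on any run of characters that are neither letters nor digits.
        for (String token : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{Nd}]+")) {
            if (token.isEmpty()) {
                continue; // a leading separator yields one empty token
            }
            postings.computeIfAbsent(token, k -> new TreeSet<>()).add(uri);
        }
    }

    /** Returns the URIs indexed under the keyword (case-insensitive), or an empty set. */
    public Set<String> lookup(String keyword) {
        return postings.getOrDefault(keyword.toLowerCase(Locale.ROOT),
                                     Collections.emptySet());
    }
}
```

A call such as index.lookup("ogg") then returns every URI whose text
contained that word, whatever its case.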

On Mon, May 08, 2006 at 04:10:00AM +0200, Jerome Flesch wrote:
> Greetings,
> 
> Please find below the second project proposal I've submitted to Google.
> 
> 
> Project proposal
> ----------------
> 
> The main goal of this project would be to create a spider for
> freesites and files.
> 
> 
> 1) My proposal
> 
> My idea would be to make two programs: a spider and an index
> viewer.
> 
> The spider would start indexing from a given Freenet URI, following
> links from page to page. After reaching a given recursion depth, it
> would restart from the starting point and update its index. It could
> be set to publish the resulting index on Freenet at a given interval
> (for example, daily, every two days, etc). The spider could index
> files and freesites on different criteria: we could, for example, use
> meta-data for freesites, but for files like ogg, odt, mpg, etc, we
> could also use their internal tags. That would require a set of
> filters, one for each kind of file, but indexing would be more
> complete.
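
The crawl described above (start URI, follow links, stop at a given
depth) can be sketched as a breadth-first traversal. This is a minimal
sketch under assumptions: the class DepthLimitedCrawl and the linksOf
function, which stands in for "fetch this URI and return the URIs it
links to", are hypothetical, and no real Freenet API is used.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.function.Function;

public class DepthLimitedCrawl {
    /**
     * Returns every URI reachable from startUri in at most maxDepth link hops.
     * linksOf is a hypothetical stand-in for fetching a page and extracting its links.
     */
    public static Set<String> crawl(String startUri, int maxDepth,
                                    Function<String, List<String>> linksOf) {
        Set<String> visited = new LinkedHashSet<>(); // also guards against link loops
        Map<String, Integer> depth = new HashMap<>();
        Deque<String> frontier = new ArrayDeque<>();

        visited.add(startUri);
        depth.put(startUri, 0);
        frontier.add(startUri);

        while (!frontier.isEmpty()) {
            String uri = frontier.poll();
            int d = depth.get(uri);
            if (d >= maxDepth) {
                continue; // recursion depth reached: record the page, do not expand it
            }
            for (String next : linksOf.apply(uri)) {
                if (visited.add(next)) { // true only the first time a URI is seen
                    depth.put(next, d + 1);
                    frontier.add(next);
                }
            }
        }
        return visited;
    }
}
```

Restarting from the starting point to update the index would then just
mean calling crawl() again on a timer.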
> 
> Using a specific index format and an index viewer may offer various
> advantages. The main one is that, once loaded, indexes can be sorted
> and displayed in different ways. For example, we can imagine a tree
> view (for this one, we would need to take care of link loops) or,
> more simply, a list. The index viewer would make it possible to sort
> entries by a given order of meta-data fields.
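
Sorting by a given order of meta-data fields could be done by chaining
comparators. A minimal sketch, assuming each index entry is simply a
map from field name to value; the class IndexSorter and the field names
used in the usage note are hypothetical.

```java
import java.util.Comparator;
import java.util.List;
import java.util.Map;

public class IndexSorter {
    /**
     * Builds a comparator that orders entries by the first field, then by
     * the second on ties, and so on. A missing field sorts as an empty string.
     */
    public static Comparator<Map<String, String>> byFields(List<String> fields) {
        // Start from a comparator that treats all entries as equal.
        Comparator<Map<String, String>> cmp = (a, b) -> 0;
        for (String field : fields) {
            cmp = cmp.thenComparing(
                    (Map<String, String> entry) -> entry.getOrDefault(field, ""));
        }
        return cmp;
    }
}
```

With entries held in a List, entries.sort(IndexSorter.byFields(
Arrays.asList("type", "title"))) would group them by type and order
them by title within each type.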
> 
> 
> 2) Technical aspect
> 
> For the spider, we have two possibilities: make a plugin for the
> node, or make a separate program. I think a separate program would be
> better, because it would avoid overloading the node.
> To limit bandwidth use, the user will have to specify a delay between
> successive requests made by the spider.
> 
> To allow the spider to parse as many file formats as possible, and to
> find specific tags and new URIs to explore, a plugin mechanism could
> be a good solution.
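
One possible shape for such a plugin mechanism is a registry of filters
keyed by content type. This is a sketch under assumptions, not a
proposed API: the ContentFilter interface, its two methods and the use
of MIME types as keys are all hypothetical.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Optional;

public class FilterRegistry {
    /** Hypothetical plugin contract: one implementation per kind of file. */
    public interface ContentFilter {
        /** Tags found in the file, e.g. title or artist. */
        Map<String, String> extractTags(byte[] data);

        /** New URIs found in the file for the spider to explore. */
        List<String> extractLinks(byte[] data);
    }

    private final Map<String, ContentFilter> byMimeType = new HashMap<>();

    /** Registers a filter for a MIME type; a later registration replaces an earlier one. */
    public void register(String mimeType, ContentFilter filter) {
        byMimeType.put(mimeType.toLowerCase(Locale.ROOT), filter);
    }

    /** Looks up the filter for a MIME type (case-insensitive), if one is registered. */
    public Optional<ContentFilter> forType(String mimeType) {
        return Optional.ofNullable(byMimeType.get(mimeType.toLowerCase(Locale.ROOT)));
    }
}
```

The spider would ask the registry for a filter for each fetched file
and fall back to indexing by meta-data alone when none is registered.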
> 
> To avoid portability problems, I think writing the spider and the
> viewer in Java would be better. The viewer would then use Swing for
> the GUI.
> 
> 
> 3) Possible evolution
> 
> One interesting evolution would be to reuse existing indexes: if a
> spider discovers an existing index, it could link to it from its own
> index and try to avoid indexing sites already covered by that
> index.
> 
> Another interesting feature would be to allow the user to export the
> index as HTML and upload it to Freenet. We can even imagine that the
> spider could do this automatically each time it uploads a new version
> of its index to Freenet.
> 
> 
> Brief biography
> ---------------
> 
> I'm 20 years old and French. I'm currently studying software
> engineering at the UTBM, Université de Technologie de
> Belfort-Montbéliard (a French university). I've already obtained a
> two-year technical degree (DUT) in Telecommunications and Networking.
> 
> During my final DUT training period, which was at IrES (Subatomic
> Research Institute of Strasbourg, France), I had to work with various
> Java technologies, like Struts, OJB, Tomcat, etc.
> 
> Thanks to some university projects, I already have a good knowledge
> of Swing graphical interfaces [2].
> 
> Until now, my only contributions to the Open Source movement have
> been some articles about the GrSecurity patch and the Prelude
> Intrusion Detector. That is why I see the Google Summer of Code as a
> good opportunity to join an Open Source project such as Freenet.
> 
> Until 1st July, I will have various exams and projects to hand in, so
> my availability may vary, but I will do my best to keep time for this
> project. After 1st July, I will be able to dedicate all my time to
> this project.
> 
> 
> About this proposal
> -------------------
> 
> Even though this second proposal interests me, please consider that I
> would prefer, if possible, to work on my first proposal (file
> upload and download utility).
> 
> 
> Best regards,
> 
> --
> Jerome Flesch.
> 
> 
> [1] 
> http://archives.freenetproject.org/message/20060504.164033.3c90cb65.en.html
> [2] https://jflesch.kwain.net/articles/90.php : One of my Java
>   university projects: a train / bus / subway / tramway network
>   simulator.
> _______________________________________________
> Tech mailing list
> Tech at freenetproject.org
> http://emu.freenetproject.org/cgi-bin/mailman/listinfo/tech
> 

-- 
Matthew J Toseland - toad at amphibian.dyndns.org
Freenet Project Official Codemonkey - http://freenetproject.org/
ICTHUS - Nothing is impossible. Our Boss says so.