Greetings,

Please find below the second project proposal I have submitted to Google.


Project proposal
----------------

The main goal of this project would be to create a spider that
indexes freesites and files.


1) My proposal

My idea would be to make two programs: the spider and an index
viewer.

The spider would start indexing from a given Freenet URI and follow
links from page to page. After reaching a given recursion depth, it
would restart from the starting point and update its index. It could
be set to publish the resulting index on Freenet at a given interval
(for example daily, or every two days). The spider could index files
and freesites according to different criteria: for freesites we could
use their meta-data, while for files such as ogg, odt or mpg we could
also use their internal tags. This would require a set of filters, one
for each kind of file, but the index would be more complete.
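As a rough illustration of the depth-limited crawl described above,
here is a minimal Java sketch. The names LinkSource and
DepthLimitedCrawl are hypothetical and do not come from Freenet; a
real spider would fetch pages through the node instead of an
in-memory link source. The set of visited URIs also prevents the
crawl from going around link loops.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical: returns the URIs linked from a page. */
interface LinkSource {
    List<String> linksOf(String uri);
}

final class DepthLimitedCrawl {
    /**
     * Visits pages breadth-first from the start URI and follows links
     * at most maxDepth levels away from it. Returns every URI seen,
     * in discovery order.
     */
    static Set<String> crawl(String start, int maxDepth, LinkSource source) {
        Set<String> visited = new LinkedHashSet<>();
        Deque<String> current = new ArrayDeque<>();
        current.add(start);
        visited.add(start);
        for (int depth = 0; depth < maxDepth && !current.isEmpty(); depth++) {
            Deque<String> next = new ArrayDeque<>();
            for (String uri : current) {
                for (String link : source.linksOf(uri)) {
                    // add() returns false for URIs already seen (loops).
                    if (visited.add(link)) {
                        next.add(link);
                    }
                }
            }
            current = next;
        }
        return visited;
    }
}
```

Restarting from the starting point, as proposed, would then simply
mean calling crawl() again and merging the result into the index.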

Using a specific index format together with an index viewer offers
several advantages. The main one is that, once loaded, an index can be
sorted and displayed in different ways. For example, we can imagine a
tree view (which would need to take care of link loops) or, more
simply, a list view. The index viewer would let the user sort entries
by a chosen sequence of meta-data fields.
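Sorting by an ordered list of meta-data fields could look like the
following sketch. It assumes, purely for illustration, that each index
entry is a map from field name to value; the class name IndexSorter
and the field names used in the usage note are hypothetical.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;

final class IndexSorter {
    /**
     * Returns a sorted copy of the entries. Entries are compared on the
     * first field, ties are broken on the second field, and so on.
     * A missing field is treated as an empty string.
     */
    static List<Map<String, String>> sortBy(List<Map<String, String>> entries,
                                            List<String> fields) {
        Comparator<Map<String, String>> order = (a, b) -> 0;
        for (String field : fields) {
            order = order.thenComparing(e -> e.getOrDefault(field, ""));
        }
        List<Map<String, String>> sorted = new ArrayList<>(entries);
        sorted.sort(order);
        return sorted;
    }
}
```

For example, sortBy(entries, List.of("type", "title")) would group
entries by type and order them by title within each type.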


2) Technical aspect

For the spider, we have two possibilities: make a plugin for the node,
or make a separate program. I think a separate program would be
better, because it would avoid overloading the node.
To limit bandwidth use, the user would specify a minimum delay between
requests made by the spider.
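The delay between requests could be enforced with something as small
as the sketch below. RequestThrottle is a hypothetical name; the time
is passed in explicitly to the two core methods so that the logic can
be checked without actually sleeping.

```java
final class RequestThrottle {
    private final long minIntervalMillis;
    private long lastRequestMillis;
    private boolean hasRequested = false;

    RequestThrottle(long minIntervalMillis) {
        this.minIntervalMillis = minIntervalMillis;
    }

    /** How long the caller must still wait, at time nowMillis, before the next request. */
    long delayBefore(long nowMillis) {
        if (!hasRequested) {
            return 0;
        }
        return Math.max(0, lastRequestMillis + minIntervalMillis - nowMillis);
    }

    /** Records that a request was made at time nowMillis. */
    void markRequest(long nowMillis) {
        hasRequested = true;
        lastRequestMillis = nowMillis;
    }

    /** Blocks until a request is allowed, then records it. */
    void acquire() throws InterruptedException {
        long wait = delayBefore(System.currentTimeMillis());
        if (wait > 0) {
            Thread.sleep(wait);
        }
        markRequest(System.currentTimeMillis());
    }
}
```

The spider would call acquire() before each request, with the
interval taken from the user's configuration.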

To let the spider parse as many file formats as possible, both to find
format-specific tags and to discover new URIs to explore, a plugin
mechanism could be a good solution.
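One possible shape for such a plugin mechanism is sketched below. The
FormatFilter interface, the FilterRegistry class and the use of MIME
types to select a filter are all assumptions made for this example,
and the regular expression is only a naive stand-in for a real HTML
parser.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical plugin contract: one implementation per file format. */
interface FormatFilter {
    boolean accepts(String mimeType);
    List<String> extractLinks(byte[] content);
}

/** Example plugin: collects the values of double-quoted href attributes. */
final class HtmlLinkFilter implements FormatFilter {
    private static final Pattern HREF =
        Pattern.compile("href\\s*=\\s*\"([^\"]*)\"", Pattern.CASE_INSENSITIVE);

    public boolean accepts(String mimeType) {
        return "text/html".equals(mimeType);
    }

    public List<String> extractLinks(byte[] content) {
        List<String> links = new ArrayList<>();
        Matcher m = HREF.matcher(new String(content, StandardCharsets.UTF_8));
        while (m.find()) {
            links.add(m.group(1));
        }
        return links;
    }
}

/** Holds the registered plugins and picks the first one that accepts a format. */
final class FilterRegistry {
    private final List<FormatFilter> filters = new ArrayList<>();

    void register(FormatFilter filter) {
        filters.add(filter);
    }

    Optional<FormatFilter> filterFor(String mimeType) {
        return filters.stream().filter(f -> f.accepts(mimeType)).findFirst();
    }
}
```

A filter for ogg or odt files would implement the same interface and
could expose the internal tags through an additional method.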

To avoid portability problems, I think it would be better to write
both the spider and the viewer in Java. The viewer would then use
Swing for its GUI.


3) Possible evolution

One interesting evolution would be to reuse existing indexes: if a
spider discovers an existing index, it could link to that index from
its own and try to avoid re-indexing the sites it already covers.

Another interesting feature would be to let the user export the index
as HTML and upload it to Freenet. We can even imagine the spider doing
this automatically each time it uploads a new version of its index to
Freenet.


Brief biography
---------------

I'm a 20-year-old French student. I'm currently studying software
engineering at the UTBM, Université de Technologie de
Belfort-Montbéliard (a French university). I have already obtained a
two-year technical degree (DUT) in Telecommunications and Networking.

During my final DUT training period, which took place at IrES
(Subatomic Research Institute of Strasbourg, France), I had to work
with various Java technologies such as Struts, OJB and Tomcat.

Thanks to some university projects, I already have a good knowledge of
Swing graphical interfaces [2].

Until now, my only contribution to the Open Source movement has been
writing some articles about the GrSecurity patch and the Prelude
intrusion detector. That is why I see the Google Summer of Code as a
good opportunity to join an Open Source project such as Freenet.

Until 1st July, I will have various exams and projects to hand in, so
my availability may vary, but I will do my best to keep time for this
project. After 1st July, I will be able to dedicate all my time to
this project.


About this proposal
-------------------

Even though this second proposal interests me, please note that I
would prefer, if possible, to work on my first proposal (file
upload and download utility).


Best regards,

--
Jerome Flesch.


[1] http://archives.freenetproject.org/message/20060504.164033.3c90cb65.en.html
[2] https://jflesch.kwain.net/articles/90.php : One of my Java
  university projects: a train / bus / subway / tramway network
  simulator.
