Greetings,

Please find below the second project proposal I have submitted to Google.

Project proposal
----------------

The main goal of this project is to create a spider for freesites and files.

1) My proposition

My idea is to write two programs: the spider and an index viewer.

The spider would start indexing from a given Freenet URI and follow links from page to page. After reaching a given recursion depth, it would restart from the starting point and update its index. It could be configured to publish the resulting index on Freenet at a given interval (for example daily, or every two days).

The spider could index files and freesites according to different criteria. For freesites we could use their metadata, while for files such as ogg, odt or mpg we could additionally use their internal tags. This would require a set of filters, one for each kind of file, but the index would be more complete.

Using a specific index format together with an index viewer offers several advantages. The main one is that, once loaded, an index can be sorted and displayed in different ways: for example as a tree (in which case we would need to take care of link loops) or, more simply, as a list. The viewer would also let the user sort entries by a chosen order of metadata fields.

2) Technical aspect

For the spider, there are two possibilities: write a plugin for the node, or write a separate program. I think a separate program would be better, because it would avoid overloading the node. To limit bandwidth use, the user would specify a delay between requests made by the spider.

To let the spider parse as many file formats as possible, both to find specific tags and to find new URIs to explore, a plugin mechanism could be a good solution.

To avoid portability problems, I think it would be better to write the spider and the viewer in Java. The viewer would then use Swing for its GUI.

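As a rough illustration of the spider described above, here is a minimal Java sketch of a depth-limited crawl with a fixed delay between requests and one filter per file format. All names in it (SpiderSketch, Fetcher, FormatFilter, TokenLinkFilter) are hypothetical and do not come from the Freenet code base. The Fetcher interface is only an assumed stand-in for whatever request mechanism the node offers, and the toy filter simply treats any token starting with "KSK@" as a link.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical sketch; none of these names come from Freenet. */
class SpiderSketch {

    /** Assumed stand-in for a request to the node: content at a URI, or null. */
    interface Fetcher {
        String fetch(String uri);
    }

    /** One filter per file format: extracts index tags and new URIs to explore. */
    interface FormatFilter {
        boolean accepts(String uri);
        Map<String, String> extractTags(String content);
        List<String> extractLinks(String content);
    }

    /** Toy filter for the demo: whitespace-separated tokens starting with "KSK@" are links. */
    static class TokenLinkFilter implements FormatFilter {
        public boolean accepts(String uri) { return true; }

        public Map<String, String> extractTags(String content) {
            return Collections.singletonMap("size", String.valueOf(content.length()));
        }

        public List<String> extractLinks(String content) {
            List<String> links = new ArrayList<>();
            for (String token : content.split("\\s+")) {
                if (token.startsWith("KSK@")) links.add(token);
            }
            return links;
        }
    }

    private final Fetcher fetcher;
    private final List<? extends FormatFilter> filters;
    private final int maxDepth;
    private final long delayMillis;

    SpiderSketch(Fetcher fetcher, List<? extends FormatFilter> filters,
                 int maxDepth, long delayMillis) {
        this.fetcher = fetcher;
        this.filters = filters;
        this.maxDepth = maxDepth;
        this.delayMillis = delayMillis;
    }

    /** Breadth-first crawl from startUri; returns URI -> tags in visit order. */
    Map<String, Map<String, String>> crawl(String startUri) {
        Map<String, Map<String, String>> index = new LinkedHashMap<>();
        Set<String> seen = new HashSet<>();   // guards against link loops
        List<String> frontier = new ArrayList<>();
        frontier.add(startUri);
        seen.add(startUri);
        for (int depth = 0; depth <= maxDepth && !frontier.isEmpty(); depth++) {
            List<String> next = new ArrayList<>();
            for (String uri : frontier) {
                if (!pause()) return index;   // interrupted: stop with what we have
                String content = fetcher.fetch(uri);
                if (content == null) continue;
                for (FormatFilter filter : filters) {
                    if (!filter.accepts(uri)) continue;
                    index.put(uri, filter.extractTags(content));
                    for (String link : filter.extractLinks(content)) {
                        if (seen.add(link)) next.add(link);
                    }
                    break;                    // first matching filter wins
                }
            }
            frontier = next;
        }
        return index;
    }

    /** Waits the user-configured delay before a request; false if interrupted. */
    private boolean pause() {
        try {
            Thread.sleep(delayMillis);
            return true;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        Map<String, String> pages = new HashMap<>();
        pages.put("KSK@a", "hello KSK@b");
        pages.put("KSK@b", "back to KSK@a");
        SpiderSketch spider = new SpiderSketch(pages::get,
                Collections.singletonList(new TokenLinkFilter()), 2, 100L);
        System.out.println(spider.crawl("KSK@a"));
    }
}
```

The set of already seen URIs is what keeps link loops from causing repeated requests. A real spider would also need error handling, periodic restarts from the starting point, and publication of the index, none of which is shown here.
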
3) Possible evolution

One interesting evolution would be to reuse existing indexes: if a spider discovers an existing index, it could add a link to that index in its own, and try to avoid indexing sites that the other index already covers.

Another interesting feature would be to let the user export the index as HTML and upload it to Freenet. We can even imagine the spider doing this automatically each time it uploads a new version of its index to Freenet.

Brief biography
---------------

I am 20 years old and French. I am currently studying software engineering at the UTBM, Université de Technologie de Belfort-Montbéliard (a French university). I have already obtained a two-year technical degree (DUT) in Telecommunications and Networking.

During the final internship of my DUT, which took place at IrES (Subatomic Research Institute of Strasbourg, France), I worked with various Java technologies such as Struts, OJB and Tomcat. Thanks to some university projects, I already have a good knowledge of Swing graphical interfaces [2].

Until now, my only contribution to the Open Source movement has been to write some articles about the GrSecurity patch and the Prelude intrusion detector. This is why I see the Google Summer of Code as a good opportunity to join an Open Source project such as Freenet.

Until 1st July, I will have various exams and projects to hand in, so my availability may vary, but I will do my best to keep time for this project. After 1st July, I will be able to dedicate all my time to it.

About this proposal
-------------------

Even though this second proposal interests me, please consider that I would prefer, if possible, to work on my first proposal (file upload and download utility).

Best regards,
-- 
Jerome Flesch.

[1] http://archives.freenetproject.org/message/20060504.164033.3c90cb65.en.html
[2] https://jflesch.kwain.net/articles/90.php : One of my Java university projects: a train / bus / subway / tramway network simulator.
