Hi all, I'm a postgraduate student at Fudan University in Shanghai, China. This is my first time taking part in GSoC, and I did not realize that I should discuss my ideas with possible mentors first. I submitted my proposal today, and luckily I can still modify it. Here is my proposal; any criticism and suggestions are welcome~
================================================
*Abstract:*
The main idea is to treat the XML DOM structure as a tree and reduce the problem to computing diffs between tree structures. Algorithms for computing tree diffs already exist, such as Tree Edit Distance. Some small modifications will be needed to adapt the chosen algorithm to this context.

*Detailed Description:*
The implementation of the module can be divided into 5 parts:
1. Parse the XML text to get the DOM structure;
2. Translate the DOM structure to a tree structure;
3. Employ an algorithm to compute the diffs;
4. Translate the tree diffs to XML diffs;
5. Display the diffs and maybe mail them.

*Initial Algorithm Design*
From my past research experience, Tree Edit Distance is a class of algorithms that use edit distance to measure tree similarity. These algorithms define 3 types of edit operations on a labeled tree: insert, delete and relabel. To measure the distance, they assign a weight to each operation and define the edit distance as the minimum total weight over all possible edit sequences that transform one tree into the other. The edit sequence that achieves this minimum can then be translated to describe the diffs between the XML texts.

*Draft Timeline*
- Week 1: Complete a survey of the related work to decide which algorithm to employ;
- Week 2-3: Implement the XML parser and translator module;
- Week 4-6: Implement the algorithm chosen to compute tree diffs;
- Week 7-8: Implement the module which translates the tree diffs to XML diffs and displays them;
- Week 9: Implement the module which mails the diffs to a given mail address;
- Week 10: Debug the whole module and make the modifications necessary to complete the project.

*Additional Information:*
I have been learning and using Java for 3 years.
Although my experience with processing XML in Java is limited, my knowledge of programming, software architecture and algorithms should help me learn fast and handle the problem. I'm 23 years old, living in Shanghai, China, and attending Fudan University.
================================================
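
P.S. To make steps 1 to 3 of the description more concrete, here is a minimal Java sketch. It is only an illustration under several assumptions, not the final design: the class names `TreeDiffSketch` and `TreeNode` are hypothetical, only element nodes are kept (attributes and text are ignored), every operation has weight 1, and the distance is a simplified top-down variant in the style of Selkow's algorithm, where a child can only be inserted or deleted together with its whole subtree. The general Tree Edit Distance is more permissive, and the survey in Week 1 would decide which variant to use.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

class TreeDiffSketch {

    /** Hypothetical ordered, labeled tree node (label = element name). */
    static final class TreeNode {
        final String label;
        final List<TreeNode> children = new ArrayList<>();

        TreeNode(String label) { this.label = label; }

        /** Number of nodes in this subtree, used as its insert/delete cost. */
        int size() {
            int n = 1;
            for (TreeNode c : children) n += c.size();
            return n;
        }
    }

    /** Steps 1 and 2: parse the XML text and convert the DOM to a plain tree. */
    static TreeNode parse(String xml) {
        try {
            Node root = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement();
            return toTree(root);
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    /** Keeps element nodes only; attributes and text are ignored (assumption). */
    static TreeNode toTree(Node node) {
        TreeNode t = new TreeNode(node.getNodeName());
        for (Node c = node.getFirstChild(); c != null; c = c.getNextSibling()) {
            if (c.getNodeType() == Node.ELEMENT_NODE) t.children.add(toTree(c));
        }
        return t;
    }

    /** Step 3: simplified top-down edit distance with unit weights. */
    static int distance(TreeNode a, TreeNode b) {
        int relabel = a.label.equals(b.label) ? 0 : 1;
        int m = a.children.size();
        int n = b.children.size();
        // d[i][j] = cost of turning the first i children of a
        // into the first j children of b.
        int[][] d = new int[m + 1][n + 1];
        for (int i = 1; i <= m; i++) {
            d[i][0] = d[i - 1][0] + a.children.get(i - 1).size();
        }
        for (int j = 1; j <= n; j++) {
            d[0][j] = d[0][j - 1] + b.children.get(j - 1).size();
        }
        for (int i = 1; i <= m; i++) {
            for (int j = 1; j <= n; j++) {
                TreeNode ca = a.children.get(i - 1);
                TreeNode cb = b.children.get(j - 1);
                int delete = d[i - 1][j] + ca.size();          // drop subtree ca
                int insert = d[i][j - 1] + cb.size();          // add subtree cb
                int edit = d[i - 1][j - 1] + distance(ca, cb); // edit ca into cb
                d[i][j] = Math.min(edit, Math.min(delete, insert));
            }
        }
        return relabel + d[m][n];
    }

    public static void main(String[] args) {
        TreeNode oldTree = parse("<a><b/><c><d/></c></a>");
        TreeNode newTree = parse("<a><b/><c><e/></c><f/></a>");
        // Relabel d -> e and insert f: prints 2
        System.out.println(distance(oldTree, newTree));
    }
}
```

The sketch only returns the distance. To produce the diffs themselves (step 4), the table `d` would also have to be traced back to recover which operations were chosen.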
