Hi all,
   I'm a postgraduate student at Fudan University, Shanghai, China.
   This is my first time taking part in GSoC, and I did not realize that I
should discuss my ideas with possible mentors first. I submitted my proposal
today. Luckily, I can still modify it.
   Here is my proposal. Any criticism and suggestions are welcome.

================================================

*Abstract: *

The main idea is to treat the XML DOM structure as a tree and reduce the
problem to computing diffs between tree structures. Algorithms for computing
tree diffs already exist, such as Tree Edit Distance. Some small
modifications will be needed to adapt the chosen algorithm to this
context.

*Detailed Description: *

The implementation of the module can be divided into 5 parts:

   1. Parse the XML text to get the DOM structure;
   2. Translate the DOM structure to tree structure;
   3. Employ some algorithm to compute the diffs;
   4. Translate the tree diffs to XML diffs;
   5. Display the diffs and maybe mail them.
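
As a rough illustration of steps 1 and 2, here is a minimal sketch in Java using the standard JAXP DOM parser. The names `DomToTreeSketch`, `TreeNode`, `parse`, `toTree` and `labelOf` are hypothetical and only for illustration. The sketch assumes that only elements and non-blank text nodes matter; attributes, comments and CDATA sections are ignored here and would need their own handling in a real module.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

final class DomToTreeSketch {

    /** Plain labeled ordered tree (hypothetical helper type). */
    static final class TreeNode {
        final String label;
        final List<TreeNode> children = new ArrayList<>();

        TreeNode(String label) {
            this.label = label;
        }
    }

    /** Step 1: parse the XML text into a DOM document, then convert its root element. */
    static TreeNode parse(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        return toTree(doc.getDocumentElement());
    }

    /** Step 2: copy elements and non-blank text nodes into the plain tree. */
    static TreeNode toTree(Node domNode) {
        TreeNode tree = new TreeNode(labelOf(domNode));
        NodeList kids = domNode.getChildNodes();
        for (int i = 0; i < kids.getLength(); i++) {
            Node kid = kids.item(i);
            boolean isElement = kid.getNodeType() == Node.ELEMENT_NODE;
            boolean isText = kid.getNodeType() == Node.TEXT_NODE;
            boolean isBlankText = isText && kid.getNodeValue().trim().isEmpty();
            // Assumption: whitespace-only text is formatting noise and is skipped.
            if ((isElement || isText) && !isBlankText) {
                tree.children.add(toTree(kid));
            }
        }
        return tree;
    }

    /** Elements are labeled by tag name, text nodes by their trimmed content. */
    static String labelOf(Node n) {
        return n.getNodeType() == Node.TEXT_NODE
                ? "#text:" + n.getNodeValue().trim()
                : n.getNodeName();
    }
}
```

Calling `DomToTreeSketch.parse("<a><b>hi</b><c/></a>")` would give a root labeled `a` with two children, `b` (holding one text child) and `c`.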

*Initial Algorithm Design*

From my past research experience, Tree Edit Distance is a class of
algorithms that use edit distance to measure tree similarity. The
algorithms define 3 types of edit operations on labeled trees: insert, delete
and relabel. To measure the distance, the algorithms assign a weight to each
operation and define the edit distance as the minimum total weight over all
possible edit sequences that transform one tree into the other. An edit
sequence that achieves this minimum weight can then be translated
to describe the diffs between the XML texts.
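
To make the definition concrete, here is a small Java sketch of the textbook recursion for the edit distance between ordered labeled trees. It is not the algorithm the module will finally use: it assumes unit weight for every operation, does no memoization (so it is exponential and only suitable for tiny trees), and returns only the distance, not the edit sequence. The names `TreeEditDistanceSketch`, `LabeledNode`, `distance`, `forestDistance` and `removeRightmostRoot` are hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

final class TreeEditDistanceSketch {

    /** Labeled ordered tree node (hypothetical helper type). */
    static final class LabeledNode {
        final String label;
        final List<LabeledNode> children = new ArrayList<>();

        LabeledNode(String label, LabeledNode... kids) {
            this.label = label;
            Collections.addAll(children, kids);
        }
    }

    /** Edit distance between two trees, assuming every operation costs 1. */
    static int distance(LabeledNode a, LabeledNode b) {
        return forestDistance(Collections.singletonList(a), Collections.singletonList(b));
    }

    /** Distance between two ordered forests, recursing on their rightmost roots. */
    static int forestDistance(List<LabeledNode> f1, List<LabeledNode> f2) {
        if (f1.isEmpty() && f2.isEmpty()) {
            return 0;
        }
        if (f1.isEmpty()) {
            return forestDistance(f1, removeRightmostRoot(f2)) + 1; // insert
        }
        if (f2.isEmpty()) {
            return forestDistance(removeRightmostRoot(f1), f2) + 1; // delete
        }
        LabeledNode v = f1.get(f1.size() - 1);
        LabeledNode w = f2.get(f2.size() - 1);

        // Delete v: its children take its place in the first forest.
        int delete = forestDistance(removeRightmostRoot(f1), f2) + 1;
        // Insert w: its children take its place in the second forest.
        int insert = forestDistance(f1, removeRightmostRoot(f2)) + 1;
        // Map v to w: match their subtrees and the remaining forests separately.
        int relabel = forestDistance(v.children, w.children)
                + forestDistance(f1.subList(0, f1.size() - 1), f2.subList(0, f2.size() - 1))
                + (v.label.equals(w.label) ? 0 : 1);

        return Math.min(relabel, Math.min(delete, insert));
    }

    /** Removes the rightmost root and promotes its children to roots. */
    static List<LabeledNode> removeRightmostRoot(List<LabeledNode> forest) {
        List<LabeledNode> out = new ArrayList<>(forest.subList(0, forest.size() - 1));
        out.addAll(forest.get(forest.size() - 1).children);
        return out;
    }
}
```

A practical implementation would replace this recursion with a dynamic programming algorithm from the literature and would also record which operation was chosen at each step, so that the best edit sequence can be recovered and reported as the diff.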

*Draft Timeline*

   - Week 1 Survey the related work to decide which
   algorithm to employ;
   - Week 2-3 Implement the XML parser and translator module;
   - Week 4-6 Implement the algorithm chosen to compute tree diffs;
   - Week 7-8 Implement the module which translates the tree diffs to XML
   diffs and displays them;
   - Week 9 Implement the module which can mail the diffs to a given mail
   address;
   - Week 10 Debug the whole module and make any modifications needed
   to successfully complete the project.

*Additional Information:*

I've been learning and using Java for 3 years. Although my experience
in processing XML with Java is limited, my knowledge of
programming, software architecture and algorithms will help me learn quickly
and handle the problem.

I'm 23 years old, living in Shanghai, China, attending Fudan University.

================================================
