Hi,

We have observed a few scenarios in Developer Studio where artifact.xml
files and the pom.xml files of CApp projects get corrupted while a user
performs a refactoring operation such as deleting or renaming project
artifacts. After analyzing those scenarios, we have identified the root
causes. Below is a report on them, along with possible solutions to
overcome them properly.


*Problem 1: Deleting multiple artifacts*
When multiple artifacts in a project are deleted or renamed at the same
time, the artifact.xml file of the project and the pom.xml files of the
CApp projects that include those artifacts sometimes get corrupted.

For example, consider a scenario where we want to delete multiple registry
resources from a registry project, and those resources are included in a
CApp project open in the same workspace.

The delete preview window correctly shows the required changes to the
pom.xml of the CApp project and the artifact.xml of the registry project.
Yet at the moment those changes are applied, the files get corrupted.

Eclipse applies the required changes to a particular file (the list you can
see in the delete preview window) *one by one*. To apply them, the Eclipse
refactoring API uses the text token metadata - offsets, lengths (and
replacement tokens for a rename operation) - defined in each change object
(the delete preview uses the same metadata when displaying diffs for each
file). It has no knowledge of XML content; it is only a text replace API.
The problem is that when we have multiple change objects for a single file,
applying the first change invalidates the offsets defined in the subsequent
change objects for the same file. They need to be re-synced with the file
as modified by the first change, and the same holds for each subsequent
change.

Since Eclipse does not re-sync the offsets in this way, we end up with a
corrupted XML file containing random strings.
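
To make the failure mode concrete, here is a minimal plain-Java sketch of
stale offsets. It does not use the Eclipse refactoring API; the Delete
class, the sample XML and the element names are all made up for
illustration. Both deletions are computed against the original text, then
applied one after the other without adjusting the second offset.

```java
import java.util.Arrays;
import java.util.List;

class StaleOffsetDemo {
    /** A text-only delete: remove `length` characters starting at `offset`. */
    static final class Delete {
        final int offset;
        final int length;
        Delete(int offset, int length) {
            this.offset = offset;
            this.length = length;
        }
    }

    /** Computes a delete for `token` against the given (original) text. */
    static Delete deleteOf(String text, String token) {
        return new Delete(text.indexOf(token), token.length());
    }

    /** Applies each delete in turn WITHOUT re-syncing later offsets. */
    static String applyOneByOne(String text, List<Delete> edits) {
        StringBuilder sb = new StringBuilder(text);
        for (Delete d : edits) {
            // Clamp to the current length so a stale offset cannot throw.
            int start = Math.min(d.offset, sb.length());
            int end = Math.min(d.offset + d.length, sb.length());
            sb.delete(start, end);
        }
        return sb.toString();
    }

    /** Hypothetical artifact.xml content with three entries. */
    static final String XML =
        "<artifacts><artifact name=\"alpha\"/><artifact name=\"b\"/>"
        + "<artifact name=\"c\"/></artifacts>";

    /** Tries to delete "alpha" and "b"; the second offset is stale. */
    static String corruptedResult() {
        List<Delete> edits = Arrays.asList(
            deleteOf(XML, "<artifact name=\"alpha\"/>"),
            deleteOf(XML, "<artifact name=\"b\"/>"));
        return applyOneByOne(XML, edits);
    }

    public static void main(String[] args) {
        System.out.println(corruptedResult());
    }
}
```

In this sketch the second delete lands in the middle of the "c" entry
instead of removing "b", so the result is no longer well-formed XML.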

*Solution:*

Re-syncing the offsets after every change is not a scalable solution, and
in any case the Eclipse refactoring API offers no way to do it.

The key point is not to have multiple change objects for the same file. A
single change object is capable of containing metadata for multiple tokens.
If we can collect the list of all files which are going to be
deleted/renamed within a single refactoring operation, we can group all the
required changes by file ID and create a single change object for each file
that needs modification.

This is achievable with the ISharableParticipant
<http://help.eclipse.org/luna/index.jsp?topic=%2Forg.eclipse.platform.doc.isv%2Freference%2Fapi%2Forg%2Feclipse%2Fltk%2Fcore%2Frefactoring%2Fparticipants%2FISharableParticipant.html>
[1] interface. This allows us to bind a single RefactoringParticipant
object to a list of files that are going to be refactored. The
createChange method of a shared participant is only called after all the
files are processed, hence we have time to group all the required changes
by file and create a composite change object for each file.
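
The grouping idea can be sketched in plain Java as follows. This is only
an illustration under assumptions: the Edit class, the file names and the
helper methods are hypothetical, and the sketch does not call the Eclipse
API. Inside Developer Studio the equivalent would presumably be one
TextFileChange per file carrying a single MultiTextEdit with one child
edit per token. All offsets refer to the original text, and applying them
from the highest offset down keeps the remaining offsets valid.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class GroupedEditsDemo {
    /** One text edit against the ORIGINAL content of `file`. */
    static final class Edit {
        final String file;
        final int offset;
        final int length;
        final String replacement; // empty string means delete
        Edit(String file, int offset, int length, String replacement) {
            this.file = file;
            this.offset = offset;
            this.length = length;
            this.replacement = replacement;
        }
    }

    /** Groups every edit collected during one refactoring by target file. */
    static Map<String, List<Edit>> groupByFile(List<Edit> edits) {
        Map<String, List<Edit>> byFile = new LinkedHashMap<>();
        for (Edit e : edits) {
            byFile.computeIfAbsent(e.file, k -> new ArrayList<>()).add(e);
        }
        return byFile;
    }

    /**
     * Applies all (non-overlapping) edits for one file in a single pass.
     * Edits are applied from the highest offset to the lowest, so text
     * before each edit is untouched and its original offset still holds.
     */
    static String applyAll(String original, List<Edit> edits) {
        List<Edit> sorted = new ArrayList<>(edits);
        sorted.sort(Comparator.comparingInt((Edit e) -> e.offset).reversed());
        StringBuilder sb = new StringBuilder(original);
        for (Edit e : sorted) {
            sb.replace(e.offset, e.offset + e.length, e.replacement);
        }
        return sb.toString();
    }
}
```

With this shape, a shared participant would only need to accumulate Edit
entries as elements are added and build the per-file result once, when
createChange is finally called.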



*Problem 2: User modifying artifact.xml*
When a user has manually modified a meta file of a project, deleting or
renaming artifacts in that project sometimes corrupts the meta file.

As pointed out above, we currently use the TextFileChange API of Eclipse
to implement the refactoring model for these XML meta files. This API is
not XML aware - it only takes token offsets and lengths as arguments.
Furthermore, we currently use regular expressions and string search APIs
(e.g. indexOf) to find token offsets within the meta files. When a user
modifies these XML meta files manually, there seem to be situations where
these regex searches fail or give incorrect offsets because the formatting
of the file has changed.
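
As a small illustration of how formatting can defeat a text search, the
sketch below uses a made-up pattern and made-up artifact entries (the
actual expressions used in Developer Studio are not shown in this mail).
The pattern finds the entry in tool-generated formatting, but misses the
same entry once a user has reordered the attributes, changed the quotes
and wrapped the lines.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class RegexOffsetDemo {
    // Hypothetical pattern: assumes `name` is the first attribute, on the
    // same line as the tag name, and written with double quotes.
    static final Pattern ARTIFACT =
        Pattern.compile("<artifact name=\"([^\"]+)\"");

    /** Returns the offset of the artifact entry with this name, or -1. */
    static int offsetOf(String xml, String name) {
        Matcher m = ARTIFACT.matcher(xml);
        while (m.find()) {
            if (m.group(1).equals(name)) {
                return m.start();
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        String generated =
            "<artifacts><artifact name=\"a\" type=\"registry/resource\"/></artifacts>";
        String handEdited =
            "<artifacts>\n  <artifact\n      type=\"registry/resource\"\n"
            + "      name='a'/>\n</artifacts>";
        System.out.println(offsetOf(generated, "a"));  // found
        System.out.println(offsetOf(handEdited, "a")); // not found
    }
}
```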


*Solution:*

It is possible to fine-tune these regular expressions to overcome this.
However, that is not a reliable solution, as we cannot guarantee that
regular expressions will capture every possible scenario.

On the other hand, it is possible to implement an XML aware file change
API using an XML parser as the base. Then, instead of doing regex based
text searches, we can use XPath expressions as the basis for a change.
The underlying parser - regardless of formatting changes in the file - will
take care of replacing or removing the elements matched by the XPath.
However, we then face the issue of keeping the text formatting untouched
during serialization and de-serialization of the XML file, as we need to
keep the diff clean for the SCM provider. This is where non-extractive XML
parsing [2] comes into the picture.
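
To illustrate only the matching half of this idea, the sketch below uses
the DOM parser and XPath engine that ship with the JDK. The element and
attribute names are assumptions made for the example. Note that DOM is an
extractive parser, so this shows that an XPath match is independent of
formatting, but it does not address the concern about keeping the original
text untouched.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class XPathMatchDemo {
    /** Counts the nodes in `xml` selected by `xpath`. */
    static int countMatches(String xml, String xpath) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
            NodeList nodes = (NodeList) XPathFactory.newInstance()
                .newXPath()
                .evaluate(xpath, doc, XPathConstants.NODESET);
            return nodes.getLength();
        } catch (Exception e) {
            // Wrap checked parser/XPath exceptions to keep the sketch short.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Hypothetical XPath for one artifact entry.
        String xpath = "/artifacts/artifact[@name='a']";
        String generated =
            "<artifacts><artifact name=\"a\" type=\"registry/resource\"/></artifacts>";
        String handEdited =
            "<artifacts>\n  <artifact\n      type=\"registry/resource\"\n"
            + "      name='a'/>\n</artifacts>";
        // The same expression selects the entry in both layouts.
        System.out.println(countMatches(generated, xpath));
        System.out.println(countMatches(handEdited, xpath));
    }
}
```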

We can use a non-extractive XML parsing library such as VTD-XML [3] to
avoid rewriting the whole XML file. Since the VTD-XML API already deals
with offsets and lengths, we can implement a hybrid solution by extending
the TextFileChange API to use VTD-XML for parsing the XML and identifying
the offsets of the elements which need to be replaced/removed.


Please share your thoughts on this.


[1] Sharable Participants :
http://help.eclipse.org/luna/index.jsp?topic=%2Forg.eclipse.platform.doc.isv%2Freference%2Fapi%2Forg%2Feclipse%2Fltk%2Fcore%2Frefactoring%2Fparticipants%2FISharableParticipant.html

[2] Non-extractive XML parsing :
http://www.xml.com/pub/a/2004/05/19/parsing.html

[3] VTD-XML project : http://vtd-xml.sourceforge.net/


Thanks,

*Kavith Lokuhewage*
Senior Software Engineer
WSO2 Inc. - http://wso2.com
lean . enterprise . middleware
Mobile - +94779145123
Linkedin <http://www.linkedin.com/pub/kavith-lokuhewage/49/473/419>  Twitter
<https://twitter.com/KavithThiranga>
_______________________________________________
Architecture mailing list
Architecture@wso2.org
https://mail.wso2.org/cgi-bin/mailman/listinfo/architecture
