RE: Possible improvements to Axis2/C

Bill Mitchell Tue, 10 Jun 2008 08:33:24 -0700

Supun, when I read your proposal, I started to wonder how long the
memory/buffer containing the strings will persist, since as soon as the
underlying buffer is freed or reused the axiom tree using pointers to these
strings will become invalid.  I am guessing that the buffer is freed or
reused when the next request is issued, and you are assuming that at that
time the tree structure from the last request has also been freed.


 

My particular concern is with the logic in axiom_node_detach.  This is used
to transfer ownership of a subtree from the axiom tree to the caller.  Among
other users of this routine, the generated adb classes use axiom_node_detach
to return to the application the subtree corresponding to WSDL data of type
Any.  The application may retain ownership of these returned subtrees for
long periods, well beyond the next request.  

 

To maintain the current API, I infer that axiom_node_detach will need to
traverse the entire subtree that is being detached, look for any strings
that are not owned, and replace them with new copies of the strings that are
owned.  So, as well as the new routines you suggested like
axiom_element_create_nos, won't you need another set of routines like
axiom_element_force_ownership for each axiom structure?  Then
axiom_node_detach could invoke these force_ownership routines on each
element of the subtree to make sure that every unowned string is replaced.
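
A minimal sketch of what such a force_ownership pass might look like. The
demo_node_t type and both function names are hypothetical stand-ins, not the
real axiom_node_t or any existing Axis2/C routine, and the borrowed names are
assumed to be NUL-terminated for simplicity.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for an axiom node; not the real axiom_node_t. */
typedef struct demo_node {
    char *name;                     /* may point into the parser buffer */
    int owns_strings;               /* 0 while name is only borrowed */
    struct demo_node *first_child;
    struct demo_node *next_sibling;
} demo_node_t;

/* Replace a borrowed name with a heap copy so the node can outlive the
 * parser buffer. Returns 0 on success, -1 if allocation fails. */
static int demo_node_force_ownership(demo_node_t *node)
{
    assert(node != NULL);
    if (!node->owns_strings && node->name != NULL) {
        size_t len = strlen(node->name);
        char *copy = malloc(len + 1);
        if (copy == NULL)
            return -1;
        memcpy(copy, node->name, len + 1);
        node->name = copy;
    }
    node->owns_strings = 1;
    return 0;
}

/* Walk the whole subtree, as a detach operation would have to. */
static int demo_subtree_force_ownership(demo_node_t *root)
{
    demo_node_t *child;

    if (demo_node_force_ownership(root) != 0)
        return -1;
    for (child = root->first_child; child != NULL;
         child = child->next_sibling) {
        if (demo_subtree_force_ownership(child) != 0)
            return -1;
    }
    return 0;
}
```

The cost of this pass is proportional to the size of the detached subtree, so
detaching would no longer be a constant-time pointer operation.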

 

Regards,

Bill Mitchell

 

From: Supun Kamburugamuva [mailto:[EMAIL PROTECTED] 
Sent: Tuesday, June 10, 2008 12:49 AM
To: Apache AXIS C Developers List
Subject: Re: Possible improvements to Axis2/C

 

Hi List,

I have managed to get this working. The purpose of this implementation
is to avoid duplicating strings from the parser to the Axiom level.
Guththila keeps a buffer of the incoming XML. Instead of duplicating XML
strings from this buffer, Guththila can give a pointer to the starting
position of the string in the buffer. We need to build Axiom by using these
pointers, and Axiom shouldn't assume ownership of these strings. This is
a brief summary of what I have done.

Axiom Model Level

Basically I have introduced a Boolean flag to virtually every structure in
Axiom, as well as to the axutil_qname_t structure. This flag indicates whether
we are allowed to free the string buffers or not. Here is an example.

struct axiom_comment
{
    /** comment text */
    axutil_string_t *value;
    /** True if we are allowed to free string buffers */
    axis2_bool_t owns_strings;
};

Then I have introduced a new create method for all the structures in om.
This method creates the structure without assuming the ownership of the
strings. Following is the new method for axiom_element_t structure.

axiom_element_create_nos(params same as normal create method)

I don't feel right about the name of the method (nos means "not owns
strings"), so I would like to hear a more readable name from you guys.
This create method sets owns_strings to FALSE, and all the getter and
setter methods were changed to act according to the owns_strings field.
These are the only changes at the Axiom level and, as you can see, there are
no changes to the existing API, just a few additional API methods.
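
A minimal sketch of how the owns_strings flag could drive create and free.
The demo_comment_t type and its functions are hypothetical simplifications: a
plain char pointer stands in for axutil_string_t, and none of these names are
part of Axis2/C.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical simplification of the comment structure shown above. */
typedef struct demo_comment {
    char *value;
    int owns_strings;   /* non-zero if value must be freed with the struct */
} demo_comment_t;

/* Normal create: duplicates the text, so the structure owns its copy. */
static demo_comment_t *demo_comment_create(const char *text)
{
    demo_comment_t *c;
    size_t len;

    assert(text != NULL);
    c = malloc(sizeof *c);
    if (c == NULL)
        return NULL;
    len = strlen(text);
    c->value = malloc(len + 1);
    if (c->value == NULL) {
        free(c);
        return NULL;
    }
    memcpy(c->value, text, len + 1);
    c->owns_strings = 1;
    return c;
}

/* "nos" create: borrows the caller's pointer instead of copying it. */
static demo_comment_t *demo_comment_create_nos(char *text)
{
    demo_comment_t *c;

    assert(text != NULL);
    c = malloc(sizeof *c);
    if (c == NULL)
        return NULL;
    c->value = text;
    c->owns_strings = 0;
    return c;
}

/* Free the structure, and the text only if the structure owns it. */
static void demo_comment_free(demo_comment_t *c)
{
    if (c == NULL)
        return;
    if (c->owns_strings)
        free(c->value);
    free(c);
}
```

With this layout the borrowed variant costs one allocation instead of two,
and the buffer it points into has to outlive the structure.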

Stax Level

I have introduced the "owns_strings" Boolean flag to the
axiom_stax_builder_t structure. If this is FALSE, the builder will build the
tree without assuming ownership of the strings. So I have introduced a
new create method for the stax builder as well.

AXIS2_EXTERN axiom_stax_builder_t *AXIS2_CALL
axiom_stax_builder_create_nos(
    const axutil_env_t * env,
    axiom_xml_reader_t * parser);

In all the stax builder methods the owns_strings flag is checked and
appropriate methods are called to build the om tree. For example if this
flag is FALSE the newly added axiom_element_create_nos will be called
instead of axiom_element_create_str. 

Parser Level

I have introduced a new method to the axiom_xml_reader API. The method is 

    AXIS2_EXTERN axis2_status_t AXIS2_CALL
    axiom_xml_reader_set_duplicate_strings(
        axiom_xml_reader_t * parser,
        const axutil_env_t * env,
        axis2_bool_t is_duplicate);

This method will try to set the parser so that it does not duplicate strings.
If the parser cannot do this (as in the case of Libxml2), the method will
return failure. The advantage is that, depending on the return value, we can
create the appropriate stax builder (one that uses duplicated strings or one
that uses strings as pointers to a buffer).
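
A minimal sketch of that selection logic. All of the demo_ types and
functions below are hypothetical stand-ins that only mirror the shape of the
proposed calls, and the rule that only Guththila can switch off duplication
is taken from the description above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical parser kinds; only Guththila exposes its buffer here. */
typedef enum {
    DEMO_PARSER_GUTHTHILA,
    DEMO_PARSER_LIBXML2
} demo_parser_kind_t;

typedef struct demo_reader {
    demo_parser_kind_t kind;
    int duplicate_strings;      /* 1: hand out copies, 0: hand out pointers */
} demo_reader_t;

typedef struct demo_builder {
    demo_reader_t *reader;
    int owns_strings;           /* 1: tree frees its strings, 0: it does not */
} demo_builder_t;

/* Returns 1 on success, 0 if the parser cannot honour the request. */
static int demo_reader_set_duplicate_strings(demo_reader_t *r,
                                             int is_duplicate)
{
    assert(r != NULL);
    if (!is_duplicate && r->kind != DEMO_PARSER_GUTHTHILA)
        return 0;   /* this parser cannot hand out pointers into a buffer */
    r->duplicate_strings = is_duplicate;
    return 1;
}

/* Pick the builder flavour from the parser's answer. */
static demo_builder_t demo_builder_for(demo_reader_t *r)
{
    demo_builder_t b;

    b.reader = r;
    /* If the parser agreed not to duplicate, the tree must not own strings. */
    b.owns_strings = demo_reader_set_duplicate_strings(r, 0) ? 0 : 1;
    return b;
}
```

Because the fallback keeps the copying behaviour, a parser that refuses the
request still produces a tree that is safe to free in the usual way.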

The implementation is at an experimental level. All the samples are working
but there are memory leaks in the system. Also I haven't checked the
implementation with Libxml2.  We need to do a performance test and see
whether this gives a good performance gain as well.

Regards,
Supun.



On Mon, Jun 2, 2008 at 6:16 AM, Supun Kamburugamuva <[EMAIL PROTECTED]>
wrote:



AFAIK yes, we can do this. The only thing is that we need to introduce
axutil_string_t to every structure in Axiom/C for handling the strings (most
of this is already done). When we create these strings we set
owns_buffer to false. When Axiom/C frees the axutil_string_t structure it
won't free the actual buffer. We can free the actual buffer at the end of
the processing of the request, and every string buffer will be freed at once.

Also, with the current design all the methods in Axiom/C return the strings
to the users, but the ownership of those strings remains with Axiom/C. So
hopefully we won't have to worry about that either.

Supun.

 

On Mon, Jun 2, 2008 at 4:51 AM, Samisa Abeysinghe <[EMAIL PROTECTED]> wrote:

Supun Kamburugamuva wrote:

Hi All,

At the moment Axis2/C is considerably faster than its Java implementation.
We have run performance tests for Axis2/C and the results are promising. But
we believe that we can improve the Axis2/C performance to a much higher
level than what it has achieved.

I have done profiling for Axis2/C on a Windows machine using the Benchmark
service and Apache "ab". These tests were done using both Apache web server
(httpd) and Simple Http Server. These results didn't show any routines which
cause bottlenecks in Axis2/C. Also Guththila is at its best and I cannot
think of any changes to Guththila which would give a major
performance improvement.

So the conclusion is that if we want to improve the performance we need to
do it in a distributed way. We need to improve the little things that go
unnoticed and uncared for but add up in the end.

There is another way that we may be able to improve the performance of
Axis2/C drastically. But to do that we need to make design-level changes to
Axis2/C. I will explain one such improvement that can be done.

When Guththila parses an XML document, it tokenizes the buffer. Please note
that these tokens are not strings; they are pointers into the actual XML
buffer. But Axiom/C is designed in such a way that it needs a copy of every
XML entity (element name, attribute name, namespace, element prefix, text
value etc.) for it to keep. So when the Axiom/C model is built from the
Guththila parser, Guththila copies the strings from the XML buffer to a new
string (this requires new memory allocation and memory copying). Here we are
duplicating the information that we already have. This gives a cleaner design
and robust code (the Java style of doing things). But it is not a good thing
from a performance point of view.

So in the new design, Axiom/C can be built without assuming ownership of the
strings. Here we are not passing a copy of a string from the parser to
Axiom/C. Instead a pointer into the buffer (the starting position of the
string) is passed. Axiom/C won't free this string because it doesn't have
ownership of it. At the end we free the XML buffer and we don't have to worry
about freeing each individual string. Here we don't need to change
anything beyond the Axiom/C level. Also Guththila is designed to handle this
and we don't need to make any changes to Guththila.
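
A minimal sketch of the difference between a token that points into the XML
buffer and a duplicated string. The demo_token_t type and the three functions
are hypothetical and much simpler than Guththila's real tokenizer; they only
illustrate the pointer-plus-length idea.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical token: a view into the XML buffer, not NUL-terminated. */
typedef struct demo_token {
    const char *start;
    size_t size;
} demo_token_t;

/* Return the name after the first '<' as a view into xml (no allocation).
 * The token is empty (start == NULL) if there is no '<'. */
static demo_token_t demo_first_element_name(const char *xml)
{
    demo_token_t tok = { NULL, 0 };
    const char *p;

    assert(xml != NULL);
    p = strchr(xml, '<');
    if (p == NULL)
        return tok;
    tok.start = ++p;
    while (*p != '\0' && *p != '>' && *p != ' ' && *p != '/')
        p++;
    tok.size = (size_t)(p - tok.start);
    return tok;
}

/* The copying path: one allocation and one copy per token. */
static char *demo_token_dup(demo_token_t tok)
{
    char *copy = malloc(tok.size + 1);

    if (copy == NULL)
        return NULL;
    if (tok.size > 0)
        memcpy(copy, tok.start, tok.size);
    copy[tok.size] = '\0';
    return copy;
}

/* Compare a token with a C string without copying it first. */
static int demo_token_equals(demo_token_t tok, const char *s)
{
    size_t n = strlen(s);

    if (n != tok.size)
        return 0;
    return n == 0 || memcmp(tok.start, s, n) == 0;
}
```

The view is only valid while the buffer is alive, which is why the tree built
from such tokens has to be freed before the buffer is released or reused.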

 

This can be done without changing the current API?

Samisa...


Another improvement is to reuse the things that are always used. I haven't
looked at this in depth, but to start with, we may be able to
reuse things like SOAP namespace strings, the envelope name, the header name
etc.

Regards,
Supun..




-- 
Samisa Abeysinghe Director, Engineering; WSO2 Inc.

http://www.wso2.com/ - "The Open Source SOA Company"

