Maybe you should commit this to a scratch area so that others can have a look.

Samisa...

Supun Kamburugamuva wrote:
Hi List,

I have managed to get this working. The purpose of this implementation is to avoid duplicating strings on the way from the parser to the Axiom level. Guththila keeps a buffer of the incoming XML. Instead of duplicating XML strings from this buffer, Guththila can give a pointer to the starting position of the string in the buffer. We need to build Axiom using these pointers, and Axiom shouldn't assume ownership of these strings. Here is a brief summary of what I have done.

*Axiom Model Level*

Basically, I have introduced a Boolean flag to virtually every structure in Axiom, as well as to the axutil_qname_t structure. This flag indicates whether we are allowed to free the string buffers or not. Here is an example.

struct axiom_comment
{
    /** comment text */
    axutil_string_t *value;
    /** True if we are allowed to free string buffers */
    axis2_bool_t owns_strings;
};

Then I introduced a new create method for all the structures in om. This method creates the structure without assuming ownership of the strings. The following is the new method for the axiom_element_t structure.

axiom_element_create_nos(params same as normal create method)

I don't feel right about the name of the method (nos means "not owns strings"), so I would like to hear a more readable name from you guys. This create method sets owns_strings to FALSE, and all the getter and setter methods were changed to act according to the owns_strings field. These are the only changes at the Axiom level and, as you can see, there are no changes to the existing API. Just a few additional API methods.
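
To make the ownership flag concrete, here is a minimal sketch of the idea. It does not use the real Axiom types: sketch_comment_t and the sketch_comment_* functions are hypothetical stand-ins, plain char * replaces axutil_string_t, and the borrowed string is assumed to be NUL-terminated.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for an Axiom node; NOT the real axiom_comment. */
typedef struct sketch_comment
{
    char *value;       /* comment text */
    int owns_strings;  /* nonzero if we are allowed to free value */
} sketch_comment_t;

/* Normal create: duplicates the text, so the node owns its copy. */
sketch_comment_t *sketch_comment_create(const char *text)
{
    sketch_comment_t *c;
    assert(text != NULL);
    c = malloc(sizeof *c);
    if (!c)
        return NULL;
    c->value = malloc(strlen(text) + 1);
    if (!c->value)
    {
        free(c);
        return NULL;
    }
    strcpy(c->value, text);
    c->owns_strings = 1;
    return c;
}

/* "nos" create: borrows the caller's pointer and never frees it. */
sketch_comment_t *sketch_comment_create_nos(char *text)
{
    sketch_comment_t *c;
    assert(text != NULL);
    c = malloc(sizeof *c);
    if (!c)
        return NULL;
    c->value = text;       /* no allocation, no copy */
    c->owns_strings = 0;
    return c;
}

/* Free the node; free the string only if the node owns it. */
void sketch_comment_free(sketch_comment_t *c)
{
    if (!c)
        return;
    if (c->owns_strings)
        free(c->value);
    free(c);
}
```

With this shape, a node made by sketch_comment_create_nos can be freed at any time without touching the parser buffer, and the buffer itself is released once by whoever owns it.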

*Stax Level*
I have introduced the "owns_strings" Boolean flag to the axiom_stax_builder_t structure. If this is FALSE, the builder will build the tree without assuming ownership of the strings. So I have introduced a new create method for the stax builder as well.

AXIS2_EXTERN axiom_stax_builder_t *AXIS2_CALL
axiom_stax_builder_create_nos(
    const axutil_env_t * env,
    axiom_xml_reader_t * parser);

In all the stax builder methods, the owns_strings flag is checked and the appropriate methods are called to build the om tree. For example, if this flag is FALSE, the newly added axiom_element_create_nos will be called instead of axiom_element_create_str.

*Parser Level*

I have introduced a new method to the axiom_xml_reader API. The method is

    AXIS2_EXTERN axis2_status_t AXIS2_CALL
    axiom_xml_reader_set_duplicate_strings(
        axiom_xml_reader_t * parser,
        const axutil_env_t * env,
        axis2_bool_t is_duplicate);

This method tries to set the parser not to duplicate strings. If it is unsuccessful (as in the case of Libxml2), it will return false. The advantage is that, depending on the return value, we can create the appropriate stax builder (one that uses duplicated strings or one that uses strings as pointers into a buffer).
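
The selection logic could look roughly like the sketch below. Everything here is a hypothetical stand-in rather than the real axiom_xml_reader or axiom_stax_builder API: sketch_reader_t models a parser that may or may not support handing out pointers into its buffer, and sketch_builder_for picks the builder mode from the setter's return value.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical status codes standing in for the real status type. */
typedef enum { SKETCH_FAILURE = 0, SKETCH_SUCCESS = 1 } sketch_status_t;

/* Hypothetical parser wrapper. */
typedef struct sketch_reader
{
    int supports_no_copy;   /* 1 for a Guththila-like parser, 0 for a Libxml2-like one */
    int duplicate_strings;  /* current mode; 1 means strings are copied */
} sketch_reader_t;

/* Hypothetical builder; owns_strings mirrors the flag described above. */
typedef struct sketch_builder
{
    sketch_reader_t *reader;
    int owns_strings;
} sketch_builder_t;

/* Ask the parser to change its duplication mode; fail if it cannot. */
sketch_status_t sketch_reader_set_duplicate_strings(sketch_reader_t *r,
                                                    int is_duplicate)
{
    assert(r != NULL);
    if (!is_duplicate && !r->supports_no_copy)
        return SKETCH_FAILURE;   /* parser can only hand out copies */
    r->duplicate_strings = is_duplicate;
    return SKETCH_SUCCESS;
}

/* Pick the builder mode from what the parser agreed to do. */
sketch_builder_t sketch_builder_for(sketch_reader_t *r)
{
    sketch_builder_t b;
    b.reader = r;
    if (sketch_reader_set_duplicate_strings(r, 0) == SKETCH_SUCCESS)
        b.owns_strings = 0;      /* the "nos" builder: borrowed pointers */
    else
        b.owns_strings = 1;      /* the normal builder: duplicated strings */
    return b;
}
```

The point of the design is that the caller never needs to know which parser is plugged in; a parser that cannot avoid copying simply refuses, and the normal builder is used.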

The implementation is at an experimental level. All the samples are working, but there are memory leaks in the system. Also, I haven't checked the implementation with Libxml2. We need to do a performance test as well, to see whether this gives a good performance gain.

Regards,
Supun.


On Mon, Jun 2, 2008 at 6:16 AM, Supun Kamburugamuva <[EMAIL PROTECTED]> wrote:



    AFAIK, yes, we can do this. The only thing is that we need to
    introduce axutil_string_t to every structure in Axiom/C for
    handling the strings (most of this is already done). When we
    create these strings we set owns_buffer to false. When Axiom/C
    frees the axutil_string_t structure it won't free the actual
    buffer. We can free the actual buffer at the end of processing
    the request, so every string buffer will be freed at once.

    Also, with the current design, all the methods in Axiom/C return
    strings to the users, but the ownership of those strings
    remains with Axiom/C. So hopefully we won't have to worry about
    that either.

    Supun.


    On Mon, Jun 2, 2008 at 4:51 AM, Samisa Abeysinghe <[EMAIL PROTECTED]> wrote:

        Supun Kamburugamuva wrote:

            Hi All,

            At the moment Axis2/C is considerably faster than its Java
            implementation. We have run performance tests for Axis2/C
            and the results are promising. But we believe that we can
            improve the Axis2/C performance to a much higher level
            than what it has achieved.

            I have done profiling for Axis2/C on a Windows machine
            using the Benchmark service and Apache "ab". These tests
            were done using both Apache web server (httpd) and Simple
            Http Server. The results didn't show any routines that
            cause bottlenecks in Axis2/C. Also, Guththila is at its
            best and I cannot think of any change to Guththila that
            would bring a major performance improvement.

            So the conclusion is that if we want to improve the
            performance we need to do it in a distributed way. We need
            to improve the little things that go unnoticed and
            uncared for but add up in the end.

            There is another way that we may be able to improve the
            performance of Axis2/C drastically. But to do that we need
            to do design level changes to Axis2/C. I will explain one
            such improvement that can be done.

            When Guththila parses an XML document, it tokenizes the
            buffer. Please note that these tokens are not strings;
            they are pointers into the actual XML buffer. But Axiom/C
            is designed in such a way that it needs to keep its own
            copy of every XML entity (element name, attribute name,
            namespace, element prefix, text value, etc.). So when the
            Axiom/C model is built from the Guththila parser,
            Guththila copies the strings from the XML buffer to new
            strings (this requires new memory allocation and memory
            copying). Here we are duplicating information that we
            already have. This gives a cleaner design and more robust
            code (the Java style of doing things), but it is not good
            from a performance point of view.

            So in the new design, Axiom/C can be built without
            assuming ownership of the strings. Here we do not pass
            a copy of a string from the parser to Axiom/C. Instead, a
            pointer into the buffer (the starting position of the
            string) is passed. Axiom/C won't free this string because
            it doesn't have ownership of it. At the end we free the
            XML buffer, and we don't have to worry about freeing
            each individual string. We don't need to change anything
            beyond the Axiom/C level. Also, Guththila is designed to
            handle this, so we don't need to make any changes to
            Guththila.


        This can be done without changing the current API?

        Samisa...


            Another improvement is to reuse the things that are
            always used. I haven't looked at this in depth, but to
            start with, we may be able to reuse things like SOAP
            namespace strings, the envelope name, the header name, etc.

            Regards,
            Supun..

            
------------------------------------------------------------------------


            No virus found in this incoming message.
            Checked by AVG. Version: 8.0.100 / Virus Database:
            269.24.4/1477 - Release Date: 6/1/2008 5:28 PM


-- Samisa Abeysinghe Director, Engineering; WSO2 Inc.

        http://www.wso2.com/ - "The Open Source SOA Company"


        ---------------------------------------------------------------------
        To unsubscribe, e-mail: [EMAIL PROTECTED]
        <mailto:[EMAIL PROTECTED]>
        For additional commands, e-mail: [EMAIL PROTECTED]
        <mailto:[EMAIL PROTECTED]>





--
Samisa Abeysinghe Director, Engineering; WSO2 Inc.

http://www.wso2.com/ - "The Open Source SOA Company"

