Maybe you should commit this to a scratch area so that others can
have a look.
Samisa...
Supun Kamburugamuva wrote:
Hi List,
I have managed to get this working. The purpose of this
implementation is to avoid duplicating strings on the way from the
parser to the Axiom level. Guththila keeps a buffer of the incoming
XML. Instead of duplicating XML strings from this buffer, Guththila
can give a pointer to the starting position of the string in the
buffer. We need to build Axiom using these pointers, and Axiom
shouldn't assume ownership of these strings. Here is a brief summary
of what I have done.
*Axiom Model Level*
Basically I have introduced a Boolean flag to virtually every
structure in Axiom, as well as to the axutil_qname_t structure. This
flag indicates whether we are allowed to free the string buffers or
not. Here is an example.
struct axiom_comment
{
    /** comment text */
    axutil_string_t *value;
    /* True if we are allowed to free string buffers */
    axis2_bool_t owns_strings;
};
Then I have introduced a new create method for all the structures in
om. This method creates the structure without assuming ownership
of the strings. The following is the new method for the axiom_element_t
structure.
axiom_element_create_nos(params same as normal create method)
I don't feel right about the name of the method (nos means "not owns
strings"), so I would like to hear a more readable name from you guys.
This create method sets owns_strings to FALSE, and all the getter
and setter methods were changed to act according to the
owns_strings field.
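A minimal sketch of the owning and non-owning create paths described above. It uses a simplified, hypothetical demo_comment_t with a raw char pointer instead of the real axiom_comment and axutil_string_t; all demo_* names are assumptions for illustration and are not part of the Axiom API.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, simplified stand-in for axiom_comment. The real structure
 * holds an axutil_string_t rather than a raw char pointer. */
typedef struct demo_comment
{
    char *value;      /* comment text */
    int owns_strings; /* non-zero if we are allowed to free value */
} demo_comment_t;

/* Owning variant: duplicates the text, so the comment must free it later. */
demo_comment_t *demo_comment_create(const char *text)
{
    demo_comment_t *c;
    size_t len;

    assert(text != NULL);
    len = strlen(text);
    c = malloc(sizeof *c);
    if (!c)
        return NULL;
    c->value = malloc(len + 1);
    if (!c->value)
    {
        free(c);
        return NULL;
    }
    memcpy(c->value, text, len + 1);
    c->owns_strings = 1;
    return c;
}

/* Non-owning ("nos") variant: borrows the caller's pointer. The caller must
 * keep the text alive for as long as the comment exists. */
demo_comment_t *demo_comment_create_nos(char *text)
{
    demo_comment_t *c;

    assert(text != NULL);
    c = malloc(sizeof *c);
    if (!c)
        return NULL;
    c->value = text;
    c->owns_strings = 0;
    return c;
}

/* Frees the text only when the comment owns it. */
void demo_comment_free(demo_comment_t *c)
{
    if (!c)
        return;
    if (c->owns_strings)
        free(c->value);
    free(c);
}
```

With this shape, freeing a comment created through the nos path leaves the borrowed text untouched, which is the behaviour the flag is meant to guarantee.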
These are the only changes at the Axiom level and, as you can see,
there are no changes to the existing API, just a few additional API
methods.
*Stax Level*
I have introduced the "owns_strings" Boolean flag to the
axiom_stax_builder_t structure. If this is FALSE, builder will build
the tree without assuming the ownership of the strings. So I have
introduced a new create method for stax builder as well.
AXIS2_EXTERN axiom_stax_builder_t *AXIS2_CALL
axiom_stax_builder_create_nos(
    const axutil_env_t * env,
    axiom_xml_reader_t * parser);
In all the stax builder methods the owns_strings flag is checked and
appropriate methods are called to build the om tree. For example if
this flag is FALSE the newly added axiom_element_create_nos will be
called instead of axiom_element_create_str.
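The dispatch described above could look roughly like the following sketch. The demo_builder_t and demo_element_t types and every demo_* function are hypothetical stand-ins, not the real axiom_stax_builder or axiom_element code.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins; not the real Axiom/C types. */
typedef struct demo_element
{
    char *localname;
    int owns_strings;
} demo_element_t;

typedef struct demo_builder
{
    int owns_strings; /* decides which create method the builder calls */
} demo_builder_t;

/* Copying path: the element gets its own duplicate of the name. */
demo_element_t *demo_element_create_str(const char *name)
{
    size_t len = strlen(name);
    demo_element_t *e = malloc(sizeof *e);

    if (!e)
        return NULL;
    e->localname = malloc(len + 1);
    if (!e->localname)
    {
        free(e);
        return NULL;
    }
    memcpy(e->localname, name, len + 1);
    e->owns_strings = 1;
    return e;
}

/* Borrowing path: the element only stores the pointer it was given. */
demo_element_t *demo_element_create_nos(char *name)
{
    demo_element_t *e = malloc(sizeof *e);

    if (!e)
        return NULL;
    e->localname = name;
    e->owns_strings = 0;
    return e;
}

/* The builder checks its flag once per node and picks the create method. */
demo_element_t *demo_builder_start_element(const demo_builder_t *b, char *name)
{
    assert(b != NULL && name != NULL);
    return b->owns_strings ? demo_element_create_str(name)
                           : demo_element_create_nos(name);
}

void demo_element_free(demo_element_t *e)
{
    if (!e)
        return;
    if (e->owns_strings)
        free(e->localname);
    free(e);
}
```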
*Parser Level*
I have introduced a new method to the axiom_xml_reader API. The method is
AXIS2_EXTERN axis2_status_t AXIS2_CALL
axiom_xml_reader_set_duplicate_strings(
    axiom_xml_reader_t * parser,
    const axutil_env_t * env,
    axis2_bool_t is_duplicate);
This method tries to tell the parser not to duplicate strings. If
that is not possible (as in the case of Libxml2), the method
returns false. The advantage is that, depending on the
return value, we can create the appropriate stax builder
(one that uses duplicated strings or one that uses strings as pointers
into a buffer).
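A small sketch of how a caller might pick the builder from that return value. The reader here is a mock: demo_reader_t, its supports_no_copy field, the status enum and all demo_* functions are assumptions made up for this example, and the real axiom_xml_reader call also takes an environment argument.

```c
#include <assert.h>
#include <stddef.h>

typedef enum { DEMO_FAILURE = 0, DEMO_SUCCESS = 1 } demo_status_t;
typedef enum { DEMO_BUILDER_COPYING, DEMO_BUILDER_NOS } demo_builder_kind_t;

/* Mock reader. supports_no_copy models a parser such as Guththila that can
 * hand out pointers into its own buffer; a parser that cannot sets it to 0. */
typedef struct demo_reader
{
    int supports_no_copy;
    int duplicate_strings; /* current mode, non-zero means "copy strings" */
} demo_reader_t;

/* Mirrors the idea of axiom_xml_reader_set_duplicate_strings: fails when the
 * parser is asked to stop copying but has no way to do so. */
demo_status_t demo_reader_set_duplicate_strings(demo_reader_t *r, int is_duplicate)
{
    assert(r != NULL);
    if (!is_duplicate && !r->supports_no_copy)
        return DEMO_FAILURE;
    r->duplicate_strings = is_duplicate;
    return DEMO_SUCCESS;
}

/* Asks for the no-copy mode and falls back to the copying builder when the
 * parser refuses. */
demo_builder_kind_t demo_choose_builder(demo_reader_t *r)
{
    if (demo_reader_set_duplicate_strings(r, 0) == DEMO_SUCCESS)
        return DEMO_BUILDER_NOS;
    return DEMO_BUILDER_COPYING;
}
```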
The implementation is at an experimental level. All the samples are
working, but there are memory leaks in the system. Also, I haven't
checked the implementation with Libxml2. We need to do a performance
test and see whether this gives a good performance gain as well.
Regards,
Supun.
On Mon, Jun 2, 2008 at 6:16 AM, Supun Kamburugamuva <[EMAIL PROTECTED]> wrote:
AFAIK yes, we can do this. The only thing is that we need to introduce
axutil_string_t to every structure in Axiom/C for handling
strings (most of this is already done). When we create these
strings we set owns_buffer to false. When Axiom/C frees the
axutil_string_t structure it won't free the actual buffer. We can
free the actual buffer at the end of the processing of the request,
and every string buffer will be freed at once.
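As a rough illustration of that lifetime rule, here is a sketch of a string wrapper whose free routine respects an owns_buffer flag. demo_string_t is a hypothetical stand-in; the real axutil_string_t layout may differ, and the explicit length field is an assumption added because slices of the XML buffer are not NUL-terminated.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a string wrapper with an ownership flag. */
typedef struct demo_string
{
    char *buffer;    /* start of the text */
    size_t length;   /* number of bytes that belong to this string */
    int owns_buffer; /* non-zero if demo_string_free may free buffer */
} demo_string_t;

/* Creates a view over memory owned by someone else, for example the
 * parser's XML buffer. Nothing is copied. */
demo_string_t *demo_string_create_view(char *start, size_t length)
{
    demo_string_t *s;

    assert(start != NULL);
    s = malloc(sizeof *s);
    if (!s)
        return NULL;
    s->buffer = start;
    s->length = length;
    s->owns_buffer = 0;
    return s;
}

/* Frees the wrapper, and the bytes only when the wrapper owns them. */
void demo_string_free(demo_string_t *s)
{
    if (!s)
        return;
    if (s->owns_buffer)
        free(s->buffer);
    free(s);
}
```

Every view can be released while the request is processed, and a single free of the underlying XML buffer at the end of the request releases all the string bytes at once.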
Also, with the current design all the methods in Axiom/C return
strings to the users, but the ownership of those strings
remains with Axiom/C. So hopefully we won't have to worry about
that either.
Supun.
On Mon, Jun 2, 2008 at 4:51 AM, Samisa Abeysinghe <[EMAIL PROTECTED]> wrote:
Supun Kamburugamuva wrote:
Hi All,
At the moment Axis2/C is considerably faster than its Java
implementation. We have run performance tests for Axis2/C
and the results are promising. But we believe that we can
improve the Axis2/C performance to a much higher level
than what it has achieved.
I have done profiling for Axis2/C on a Windows machine
using the Benchmark service and Apache "ab". These tests
were done using both Apache web server (httpd) and Simple
Http Server. These results didn't show any routines that
cause bottlenecks in Axis2/C. Also, Guththila is at its
best and I cannot think of any changes to
Guththila that would bring a major performance improvement.
So the conclusion is that if we want to improve
performance we need to do it in a distributed way. We need
to improve the little things that go unnoticed and
uncared for but add up in the end.
There is another way that we may be able to improve the
performance of Axis2/C drastically. But to do that we need
to do design level changes to Axis2/C. I will explain one
such improvement that can be done.
When Guththila parses an XML document, it tokenizes the buffer.
Please note that these tokens are not strings; they are
pointers into the actual XML buffer. But Axiom/C is designed
in such a way that it needs to keep its own copy of every XML entity
(element name, attribute name, namespace, element prefix,
text value, etc.). So when the Axiom/C model is
built from the Guththila parser, Guththila copies the
strings from the XML buffer to new strings (this requires
new memory allocation and memory copying). Here we are
duplicating information that we already have. This
gives a cleaner design and more robust code (the Java style of
doing things), but it is not a good thing from a performance
point of view.
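To make the cost concrete, here is a small sketch of a token that is just a pointer and a size into the XML buffer, next to the copy step that the current design performs for each token. demo_token_t and both functions are hypothetical and far simpler than Guththila's real tokenizer.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* A token is only a window into the XML buffer, not a string of its own. */
typedef struct demo_token
{
    const char *start;
    size_t size;
} demo_token_t;

/* Finds the name of the first tag in xml without copying anything.
 * Returns 1 and fills *out on success, 0 if no tag name is found. */
int demo_first_tag_name(const char *xml, demo_token_t *out)
{
    const char *lt;
    size_t len;

    assert(xml != NULL && out != NULL);
    lt = strchr(xml, '<');
    if (!lt)
        return 0;
    /* The name runs until whitespace, '/' or '>' (or the end of input). */
    len = strcspn(lt + 1, " \t\r\n/>");
    if (len == 0)
        return 0;
    out->start = lt + 1;
    out->size = len;
    return 1;
}

/* The copy the current design makes for every token: one allocation and
 * one memcpy per element name, attribute, prefix and so on. */
char *demo_token_to_string(const demo_token_t *tok)
{
    char *copy;

    assert(tok != NULL);
    copy = malloc(tok->size + 1);
    if (!copy)
        return NULL;
    memcpy(copy, tok->start, tok->size);
    copy[tok->size] = '\0';
    return copy;
}
```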
So in the new design, Axiom/C can be built without
assuming ownership of the strings. Here we do not pass
a copy of a string from the parser to Axiom/C. Instead, a
pointer into the buffer (the starting position of the string) is
passed. Axiom/C won't free this string because it
doesn't have ownership of it. At the end we free
the XML buffer, and we don't have to worry about freeing
each individual string. We don't need to
change anything beyond the Axiom/C level. Also, Guththila
is designed to handle this, so we don't need to make any
changes to Guththila.
Can this be done without changing the current API?
Samisa...
Another improvement is to reuse the things that are always
used. I haven't looked at this in depth, but for
a start, we may be able to reuse things like SOAP
namespace strings, the envelope name, the header name, etc.
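One possible shape for this kind of reuse is a lookup that hands back a shared constant for well-known strings instead of allocating a fresh copy per request. This is only a sketch: demo_shared_namespace is a hypothetical helper, and the two URIs are the standard SOAP 1.1 and SOAP 1.2 envelope namespaces.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Well-known namespace URIs kept as shared, read-only constants. */
static const char DEMO_SOAP11_NS[] = "http://schemas.xmlsoap.org/soap/envelope/";
static const char DEMO_SOAP12_NS[] = "http://www.w3.org/2003/05/soap-envelope";

/* Returns the shared constant when uri (len bytes, not necessarily
 * NUL-terminated) matches a well-known namespace, or NULL otherwise.
 * The caller must never free the returned pointer. */
const char *demo_shared_namespace(const char *uri, size_t len)
{
    assert(uri != NULL);
    if (len == sizeof DEMO_SOAP11_NS - 1 && memcmp(uri, DEMO_SOAP11_NS, len) == 0)
        return DEMO_SOAP11_NS;
    if (len == sizeof DEMO_SOAP12_NS - 1 && memcmp(uri, DEMO_SOAP12_NS, len) == 0)
        return DEMO_SOAP12_NS;
    return NULL;
}
```

A caller would fall back to the normal string handling whenever the lookup returns NULL.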
Regards,
Supun..
--
Samisa Abeysinghe Director, Engineering; WSO2 Inc.
http://www.wso2.com/ - "The Open Source SOA Company"
---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]