Hi Gordon,

I'll have a look. Can you create a JIRA issue and attach the patch and the test code?

Regards,
Shankar

On Mon, Jun 8, 2009 at 11:40 PM, Gordon Brown<gordonw.br...@yahoo.com> wrote:
> Can anyone in the development team please take a look at this one bug in
> the Guththila component?
> At least the potential fix I provided in this message thread?
>
> ======================
> The potential fix is to define GUTHTHILA_BUF_POS as the following:
>
> if ((_buffer)->pre_tot_data > _pos)
>     return ((_buffer)->buff[(_buffer)->cur_buff-1] + _pos);
> else
>     return ((_buffer)->buff[(_buffer)->cur_buff] + _pos -
>         (_buffer)->pre_tot_data);
> ======================
> It is a problem in the buffer management, so without fixing this bug, users
> should not use guththila at this point.
>
> Thanks!
> Gordon
> ________________________________
> From: Gordon Brown <gordonw.br...@yahoo.com>
> To: axis-c-dev@ws.apache.org; shan...@wso2.com; sam...@wso2.com
> Cc: axis-c-u...@ws.apache.org
> Sent: Friday, June 5, 2009 2:15:42 PM
> Subject: Re: soap in client call contains gabage character -- A critical bug
> in guththila writer
>
> OK, since no one replied to my question, I had to debug the code and found
> out that guththila has a bug in managing its buffers when serializing the
> axiom tree (the soap structure) before actually sending out the request,
> and I have a potential fix. This is really a critical bug I think, so I
> hope some developers can take a look at this problem. I am attaching the
> test input data and a code snippet to reproduce the problem.
>
> Basically, the bug occurs in guththila_xml_writer.c.
> The guththila_xml_writer (I call it the soap serializer) maintains an array
> of buffers dynamically when it writes the soap structure into the buffers.
> The bug will occur in the following situation:
>
> Let's say I have an element <ns1:doDeleteFirst>12345</ns1:doDeleteFirst>
> somewhere in the soap structure.
> Now before this element, there are lots of other elements, and when the
> guththila_xml_writer tries to process this element, the first buffer is
> ALMOST full. It does not have enough space to write the whole element name
> <ns1:doDeleteFirst> (the start tag) into the buffer, so it has to create a
> new buffer: it writes <ns1: at the end of the first buffer (still a few
> more bytes left empty), and writes "doDeleteFirst" at the very beginning of
> the second buffer.
>
> The first buffer (Buffer length 16384):
> --------------------------------------------------------------------------
> |**************************************************<ns1:--|
>
> The second buffer (Buffer length 32768):
> ---------------------------------------------------------------------------------------------------------------------------
> |doDeleteFirst-------------------------------------------------------------------------------------------------------------|
>
> As the second buffer becomes the current buffer, when the writer tries to
> process the end tag (</ns1:doDeleteFirst>), it uses an elem stack to track
> the namespace prefix and localname as in the following code (starting from
> line 1396):
>
> elem->name = guththila_tok_list_get_token(&wr->tok_list, env);
> elem->prefix = guththila_tok_list_get_token(&wr->tok_list, env);
> elem->name->start = GUTHTHILA_BUF_POS(wr->buffer, elem_start);
> elem->name->size = elem_len;
> elem->prefix->start = GUTHTHILA_BUF_POS(wr->buffer, elem_pref_start);
> elem->prefix->size = pref_len;
>
> The macro GUTHTHILA_BUF_POS is defined as this:
>
> #ifndef GUTHTHILA_BUF_POS
> #define GUTHTHILA_BUF_POS(_buffer, _pos) ((_buffer).buff[(_buffer).cur_buff] + _pos - (_buffer).pre_tot_data)
> #endif
>
> The bug occurs when it calculates elem->prefix->start =
> GUTHTHILA_BUF_POS(wr->buffer, elem_pref_start):
>
> The elem_pref_start has a value of 16375, the pre_tot_data has a value of
> 16379 (the first buffer length is 16384), they are
> calculated based on the first buffer data, but the current buffer is the
> second one, so elem->prefix->start points to garbage!
>
> I hope this makes sense to you. With my test case you will see this quickly.
> When you run the same XML data I attached, first set a break point at line
> 392 in the file guththila_xml_writer_wrapper, and set the hit count as 514
> in the break properties (the 514th element in <ns1:doDeleteFirst>), then
> debug step by step.
>
> The potential fix is to define GUTHTHILA_BUF_POS as the following:
>
> if ((_buffer)->pre_tot_data > _pos)
>     return ((_buffer)->buff[(_buffer)->cur_buff-1] + _pos);
> else
>     return ((_buffer)->buff[(_buffer)->cur_buff] + _pos -
>         (_buffer)->pre_tot_data);
>
> GUTHTHILA_BUF_POS is used everywhere, so I really hope some developer can
> take over this case and fix it!
>
> Thanks!
> Gordon
>
> ________________________________
> From: Gordon Brown <gordonw.br...@yahoo.com>
> To: axis-c-u...@ws.apache.org
> Cc: axis-c-dev@ws.apache.org
> Sent: Wednesday, June 3, 2009 12:49:21 AM
> Subject: soap in client call contains gabage character -- Very very puzzling
>
> Hi All,
>
> I need urgent help with a very puzzling issue with axis2/c 1.6 (I built
> axis2/c using the code from trunk, slightly before the official release).
> Here is my issue:
>
> I have a small XML document (16K) passed in as a UTF-8 string. I checked
> that the XML data is good (ran it through quite a few other tools to verify
> it). Now I use the axiom APIs to parse the XML and make a web service call
> like this:
>
> =========
>
> xml_reader = axiom_xml_reader_create_for_memory(_env,
>     (void*)xmlString_in.c_str(), xmlString_in.size(), "utf-8",
>     AXIS2_XML_PARSER_TYPE_BUFFER);
>
> om_builder = axiom_stax_builder_create(_env, xml_reader);
>
> axiom_document_t *document = axiom_stax_builder_get_document(om_builder,
>     _env);
>
> axiom_node_t * payload = axiom_document_get_root_element(document, _env);
>
> .........
>
> axiom_node_t * node = axis2_svc_client_send_receive(_wsf_service_client,
>     _env, payload );
>
> ============
>
> Now when I use tcpmon to intercept the call, I notice that the data sent
> out contains some garbage characters (always in some XML tag, not the
> element value) like this:
>
> <ns1:doDeleteFirst>12345</ù:doDeleteFirst>
>
> However, if I serialize the payload node before I make the client call, I
> can see the data is fine in memory. What puzzles me even more is that this
> only occurs with one XML file I tried, but works fine for many other XML
> inputs (even as big as 10M bytes). I've also attached the XML I used to
> reproduce the problem.
>
> Does anyone have a clue about this?
>
> Thanks much in advance!
> Gordon
>

--
S.Uthaiyashankar
Software Architect
WSO2 Inc.
http://wso2.com/ - "The Open Source SOA Company"
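
For anyone following the thread, here is a minimal sketch of the situation described above, using the numbers from the report (prefix start 16375, pre_tot_data 16379). It is only an illustration under assumptions: demo_buffer_t, demo_buf_pos and demo_setup are hypothetical names, the struct is a simplified model and not the real guththila buffer type, and the lookup walks back through earlier buffers rather than checking only cur_buff-1 as the proposed fix does. With this state the current macro would evaluate to buff[1] + 16375 - 16379, which is 4 bytes before the start of the second buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum { DEMO_MAX_BUFFS = 2 };

/* Simplified, hypothetical model of the writer's buffer list.
 * This is NOT the real guththila buffer struct. */
typedef struct {
    char  *buff[DEMO_MAX_BUFFS];      /* the individual buffers */
    size_t data_size[DEMO_MAX_BUFFS]; /* bytes written to each buffer */
    int    cur_buff;                  /* index of the current buffer */
    size_t pre_tot_data;              /* bytes held in buffers before cur_buff */
} demo_buffer_t;

/* Map a global write offset to a pointer. If the offset lies before the
 * current buffer, step back one buffer at a time until it is covered. */
static char *demo_buf_pos(const demo_buffer_t *b, size_t pos)
{
    int i;
    size_t base;

    assert(b != NULL);
    i = b->cur_buff;
    base = b->pre_tot_data;
    while (i > 0 && pos < base) {
        i--;
        base -= b->data_size[i];
    }
    return b->buff[i] + (pos - base);
}

/* Rebuild the state from the report: "<ns1:" ends the first buffer and
 * "doDeleteFirst" starts the second, which is now the current buffer. */
static void demo_setup(demo_buffer_t *b)
{
    static char first[16384];
    static char second[32768];

    memset(first, '*', 16374);          /* earlier elements (filler) */
    memcpy(first + 16374, "<ns1:", 5);  /* '<' at 16374, "ns1" at 16375 */
    memcpy(second, "doDeleteFirst", 13);

    b->buff[0] = first;
    b->data_size[0] = 16379;
    b->buff[1] = second;
    b->data_size[1] = 13;
    b->cur_buff = 1;
    b->pre_tot_data = 16379;
}
```

After demo_setup(), demo_buf_pos(&b, 16375) resolves into the first buffer at the "ns1" prefix, and demo_buf_pos(&b, 16379) resolves to the start of the second buffer. Whether the real writer can span more than two buffers for one element is something I have not checked, so treat the walk-back loop as a defensive choice, not as a statement about the actual code.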