Of course the memory allocated to a message will be freed once the message has been processed. That is why it's important to set the OUT_ONLY property: if it is not set correctly, Synapse will keep the message context (with the payload) in a callback table to correlate it with a future response (which in your case never comes). There is probably room for improvement here in Synapse:

- The VFS transport should trigger an error if there is a mismatch between the message exchange pattern and the transport configuration of the service (the transport.vfs.* parameters).
- Synapse should start issuing warnings when the number of entries in the callback table reaches a certain threshold.
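Putting the pieces from later in this thread together, a one-way streaming VFS proxy would look roughly like this. This is a sketch only: the proxy name, file URI, content type and endpoint address are placeholders, while the transport.vfs.Streaming parameter and the two property mediators are the ones quoted further down in the thread.

```xml
<proxy name="FileForwarder" transports="vfs">
  <!-- placeholder VFS polling location and content type -->
  <parameter name="transport.vfs.FileURI">file:///var/spool/in</parameter>
  <parameter name="transport.vfs.ContentType">text/plain</parameter>
  <!-- stream content instead of building it in memory -->
  <parameter name="transport.vfs.Streaming">true</parameter>
  <target>
    <inSequence>
      <!-- run the outgoing transport in the calling thread -->
      <property action="remove" name="transportNonBlocking" scope="axis2"/>
      <!-- one-way exchange: no entry is kept in the callback table -->
      <property action="set" name="OUT_ONLY" value="true"/>
      <send>
        <endpoint>
          <address uri="http://example.org/services/Target"/>
        </endpoint>
      </send>
    </inSequence>
  </target>
</proxy>
```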
Andreas

On Fri, Mar 20, 2009 at 01:41, Kim Horn <[email protected]> wrote:
> Not really; I cannot see why memory should permanently grow when I pass the
> same file repeatedly through VFS. In theory this means VFS will always
> consume all the available memory given enough time and file iterations.
> Therefore VFS cannot be used in a production system. This is the definition
> of a memory leak. I would expect SOME overhead on top of the file size, but
> I would assume the memory no longer required would be reclaimed. I would
> also assume the overhead was not 10 times the file size; that seems
> excessive.
>
> Yes, I understand the streaming approach should in theory use a fixed and
> much smaller amount of memory, but I haven't tested that yet either. Given
> the memory leak above, there is no reason it should not grow permanently
> as well, just at a smaller rate.
>
> Thanks
> Kim
>
> -----Original Message-----
> From: Andreas Veithen [mailto:[email protected]]
> Sent: Friday, 20 March 2009 10:52 AM
> To: [email protected]
> Subject: Re: VFS - Synapse Memory Leak
>
> If N is the size of the file, the memory consumption caused by the
> transport is O(N) with transport.vfs.Streaming=false and O(1) with
> transport.vfs.Streaming=true. The getTextAsStream and writeTextTo
> methods in org.apache.axis2.format.ElementHelper are there to allow
> you to implement your mediator with O(1) memory usage, so that the
> overall memory consumption remains O(1). Does that answer your
> question?
>
> Andreas
>
> On Thu, Mar 19, 2009 at 23:33, Kim Horn <[email protected]> wrote:
>> It's the same Synapse.xml as specified originally and the same trace. If
>> you are using Nabble you can see this; in case you lost the prior emails
>> I can post them again.
>>
>> I must admit I did not set those extra parameters you mentioned, but I
>> don't see why you should have to set a parameter to stop a memory leak. I
>> guessed these parameters would just reduce the large amount of memory it
>> appears to be using, e.g.
10 times the file size, via streaming? Why are there 10
>> copies of the data floating around? Lots of buffering. This issue suggests
>> to me that any use of VFS will eventually kill the server. Even with
>> smaller files it will eventually use all available memory. I guess I did
>> not understand the actual reason for this issue from the prior discussion.
>>
>> I will try your extra parameters today though.
>>
>> Thanks
>> Kim
>>
>> -----Original Message-----
>> From: Andreas Veithen [mailto:[email protected]]
>> Sent: Thursday, 19 March 2009 5:48 PM
>> To: [email protected]
>> Subject: Re: VFS - Synapse Memory Leak
>>
>> Kim,
>>
>> Can you post your current synapse.xml as well as the stack trace you get now?
>>
>> Andreas
>>
>> On Thu, Mar 19, 2009 at 07:20, kimhorn <[email protected]> wrote:
>>>
>>> Using the last stable build from 15 March 2009 I still get exactly the
>>> same behaviour as originally described with the above script. VFS still
>>> just dies. Would your fixes be in this?
>>>
>>> Andreas Veithen-2 wrote:
>>>>
>>>> I committed the code and it will be available in the next WS-Commons
>>>> transport build. The methods are located in
>>>> org.apache.axis2.format.ElementHelper in the axis2-transport-base
>>>> module.
>>>>
>>>> Andreas
>>>>
>>>> On Thu, Mar 12, 2009 at 00:06, Kim Horn <[email protected]> wrote:
>>>>> Hello Andreas,
>>>>> This is great and really helps; I have not had time to try it out but
>>>>> will soon.
>>>>>
>>>>> Contributing the java.io.Reader code would be a great help, but it will
>>>>> take me a while to get up to speed to do the Synapse iterator.
>>>>>
>>>>> In the short term I am going to use a brute-force approach that is now
>>>>> feasible given the memory issue is resolved. Just thought of this one
>>>>> today: use a VFS proxy to FTP the file locally (so streaming helps
>>>>> here), then a POJOCommand on <out> to split the file into another
>>>>> directory, streaming in and out.
Another independent VFS proxy watches that directory and submits
>>>>> each file to the Web service. Hopefully memory will be fine. Overloading
>>>>> the destination may still be an issue?
>>>>>
>>>>> Kim
>>>>>
>>>>> -----Original Message-----
>>>>> From: Andreas Veithen [mailto:[email protected]]
>>>>> Sent: Monday, 9 March 2009 10:55 PM
>>>>> To: [email protected]
>>>>> Subject: Re: VFS - Synapse Memory Leak
>>>>>
>>>>> The changes I made in the VFS transport and the message builders for
>>>>> text/plain and application/octet-stream certainly don't provide an
>>>>> out-of-the-box solution for your use case, but they are the
>>>>> prerequisite.
>>>>>
>>>>> Concerning your first proposed solution (let the VFS transport write the
>>>>> content to a temporary file): I don't like this because it would create
>>>>> a tight coupling between the VFS transport and the mediator. A design
>>>>> goal should be that the solution still works if the file comes
>>>>> from another source, e.g. an attachment in an MTOM or SwA message.
>>>>>
>>>>> I think that an all-Synapse solution (2 or 3) should be possible, but
>>>>> this will require development of a custom mediator. This mediator
>>>>> would read the content, split it up (storing the chunks in memory or
>>>>> on disk) and execute a sub-sequence for each chunk. The execution of
>>>>> the sub-sequence would happen synchronously, both to limit the
>>>>> memory/disk space consumption (to the maximum chunk size) and to avoid
>>>>> flooding the destination service.
>>>>>
>>>>> Note that it is probably not possible to implement the mediator
>>>>> using a script because of the problematic String handling. Also,
>>>>> Spring, POJO and class mediators don't support sub-sequences (I
>>>>> think). Therefore it should be implemented as a full-featured Java
>>>>> mediator, probably taking the existing iterate mediator as a template.
>>>>> I can contribute the required code to get the text content in the form
>>>>> of a java.io.Reader.
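The mediator sketched above — read the content as a java.io.Reader, split it into bounded chunks, process each chunk synchronously — boils down to logic like the following. This is an illustration outside any Synapse or Axis2 API: the ChunkSplitter class and its method are invented for the example, and records are assumed to be separated by line breaks, as in Kim's EDI files.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the splitting logic a custom chunking mediator
// would need: read line-oriented records from a Reader and group them into
// chunks of at most maxChunkChars, never splitting a record in half.
public class ChunkSplitter {

    public static List<String> split(Reader source, int maxChunkChars)
            throws IOException {
        List<String> chunks = new ArrayList<>();
        BufferedReader in = new BufferedReader(source);
        StringBuilder current = new StringBuilder();
        String line;
        while ((line = in.readLine()) != null) {
            // Flush the current chunk if adding this record would exceed the limit.
            if (current.length() > 0
                    && current.length() + line.length() + 1 > maxChunkChars) {
                chunks.add(current.toString());
                current.setLength(0);
            }
            current.append(line).append('\n');
        }
        if (current.length() > 0) {
            chunks.add(current.toString());
        }
        return chunks;
    }

    public static void main(String[] args) throws IOException {
        // Four 4-character records with a 10-character chunk limit
        // yield two chunks of two records each.
        List<String> chunks =
                split(new StringReader("rec1\nrec2\nrec3\nrec4\n"), 10);
        System.out.println(chunks.size()); // prints 2
    }
}
```

In a real mediator, each returned chunk would then be handed to the sub-sequence synchronously, so at most one chunk is held in memory at a time.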
>>>>>
>>>>> Regards,
>>>>>
>>>>> Andreas
>>>>>
>>>>> On Mon, Mar 9, 2009 at 03:05, kimhorn <[email protected]> wrote:
>>>>>>
>>>>>> Although this is a good feature, it may not solve the actual problem.
>>>>>> The first issue on my list was the memory leak.
>>>>>> However, the real problem is that once I get these massive files I have
>>>>>> to send them to a Web service that can only take them in small chunks
>>>>>> (about 14MB). Streaming a file straight out would just kill the
>>>>>> destination Web service: it would get the memory error. The text
>>>>>> document can be split apart easily, as it has independent records on
>>>>>> each line, separated by <CR><LF>.
>>>>>>
>>>>>> In an earlier post, which was not responded to, I mentioned:
>>>>>>
>>>>>> "Otherwise, for large EDI files, a VFS iterator mediator that streams
>>>>>> through the input file and outputs smaller chunks for processing in
>>>>>> Synapse may be a solution?"
>>>>>>
>>>>>> So, having mentioned a few solutions in prior posts, the options now are:
>>>>>>
>>>>>> 1) VFS writes straight to a temporary file, then a Java mediator
>>>>>> processes the file by splitting it into many smaller files. These files
>>>>>> then trigger another VFS proxy that submits them to the final Web
>>>>>> service. The drawback is that it uses the file system (not so bad).
>>>>>> 2) A Java mediator takes the <text> payload and splits it up by wrapping
>>>>>> it into many XML <data> elements that can then be acted on by a Synapse
>>>>>> iterator. So replace the text message with many smaller XML elements.
>>>>>> The problem is that this loads the whole message into memory.
>>>>>> 3) Create another iterator in Synapse that works on a regular expression
>>>>>> (to split the text data) or actually uses a for-loop approach to chop
>>>>>> the file into chunks based on the loop index value. E.g. index = 23
>>>>>> means a 14K chunk, 23 chunks into the data.
>>>>>> 4) Using the approach proposed now: just submit the file straight
>>>>>> (stream it) to another Web service that chops it up. It may return an
>>>>>> XML document with many sub-elements that allows the standard iterator to
>>>>>> work. Similar to (2), but using another service rather than Java to
>>>>>> split the document.
>>>>>> 5) Using the approach proposed now: just submit the file straight
>>>>>> (stream it) to another Web service that chops it up but calls a Synapse
>>>>>> proxy with each small packet of data, which then forwards it to the
>>>>>> final Web service. So the Web service iterates across the data, and not
>>>>>> Synapse.
>>>>>>
>>>>>> Other solutions replace Synapse with a stand-alone Java program at the
>>>>>> front end.
>>>>>>
>>>>>> Another issue here is throttling: splitting the file is one issue, but
>>>>>> submitting hundreds of calls in parallel to the destination service
>>>>>> would result in timeouts... so throttling needs to be worked in as well.
>>>>>>
>>>>>> Ruwan Linton wrote:
>>>>>>>
>>>>>>> I agree and can understand the time factor, and also +1 for reusing
>>>>>>> stuff rather than trying to reinvent the wheel :-)
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Ruwan
>>>>>>>
>>>>>>> On Sun, Mar 8, 2009 at 4:08 PM, Andreas Veithen <[email protected]> wrote:
>>>>>>>
>>>>>>>> Ruwan,
>>>>>>>>
>>>>>>>> It's not a question of possibility, it is a question of available
>>>>>>>> time :-)
>>>>>>>>
>>>>>>>> Also note that some of the features that we might want to implement
>>>>>>>> have some similarities with what is done for attachments in Axiom
>>>>>>>> (except that an attachment is only available once, while a file over
>>>>>>>> VFS can be read several times). I think there is also some existing
>>>>>>>> code in Axis2 that might be useful. We should not reimplement these
>>>>>>>> things but try to make the existing code reusable.
This however is only realistic for the next release after 1.3.
>>>>>>>>
>>>>>>>> Andreas
>>>>>>>>
>>>>>>>> On Sun, Mar 8, 2009 at 03:47, Ruwan Linton <[email protected]> wrote:
>>>>>>>> > Andreas,
>>>>>>>> >
>>>>>>>> > Can we have the caching at the file system level as a property, to
>>>>>>>> > support multiple layers touching the full message? And is it
>>>>>>>> > possible to make it specify a threshold for streaming? For example,
>>>>>>>> > if the message is touched several times we might still need
>>>>>>>> > streaming, but not for 100KB or smaller files.
>>>>>>>> >
>>>>>>>> > Thanks,
>>>>>>>> > Ruwan
>>>>>>>> >
>>>>>>>> > On Sun, Mar 8, 2009 at 1:12 AM, Andreas Veithen <[email protected]> wrote:
>>>>>>>> >>
>>>>>>>> >> I've done an initial implementation of this feature. It is available
>>>>>>>> >> in trunk and should be included in the next nightly build. In order
>>>>>>>> >> to enable this in your configuration, you need to add the following
>>>>>>>> >> parameter to the proxy:
>>>>>>>> >>
>>>>>>>> >> <parameter name="transport.vfs.Streaming">true</parameter>
>>>>>>>> >>
>>>>>>>> >> You also need to add the following mediators just before the <send>
>>>>>>>> >> mediator:
>>>>>>>> >>
>>>>>>>> >> <property action="remove" name="transportNonBlocking" scope="axis2"/>
>>>>>>>> >> <property action="set" name="OUT_ONLY" value="true"/>
>>>>>>>> >>
>>>>>>>> >> With this configuration Synapse will stream the data directly from
>>>>>>>> >> the incoming to the outgoing transport without storing it in memory
>>>>>>>> >> or in a temporary file. Note that this has two other side effects:
>>>>>>>> >> * The incoming file (or connection in the case of a remote file)
>>>>>>>> >> will only be opened on demand. In this case this happens during
>>>>>>>> >> execution of the <send> mediator.
>>>>>>>> >> * If during the mediation the content of the file is needed several
>>>>>>>> >> times (which is not the case in your example), it will be read
>>>>>>>> >> several times. The reason is of course that the content is not
>>>>>>>> >> cached.
>>>>>>>> >>
>>>>>>>> >> I tested the solution with a 2GB file and it worked fine. The
>>>>>>>> >> performance of the implementation is not yet optimal, but at least
>>>>>>>> >> the memory consumption is constant.
>>>>>>>> >>
>>>>>>>> >> Some additional comments:
>>>>>>>> >> * The transport.vfs.Streaming parameter has no impact on XML and
>>>>>>>> >> SOAP processing: this type of content is processed exactly as
>>>>>>>> >> before.
>>>>>>>> >> * With the changes described here, we now have two different
>>>>>>>> >> policies for plain text and binary content processing: in-memory
>>>>>>>> >> caching + no streaming (transport.vfs.Streaming=false) and no
>>>>>>>> >> caching + deferred connection + streaming
>>>>>>>> >> (transport.vfs.Streaming=true). We should probably define a wider
>>>>>>>> >> range of policies in the future, including file system caching +
>>>>>>>> >> streaming.
>>>>>>>> >> * It is necessary to remove the transportNonBlocking property
>>>>>>>> >> (MessageContext.TRANSPORT_NON_BLOCKING) to prevent the <send>
>>>>>>>> >> mediator (more precisely the OperationClient) from executing the
>>>>>>>> >> outgoing transport in a separate thread. This property is set by
>>>>>>>> >> the incoming transport. I think this is a bug, since I don't see
>>>>>>>> >> any valid reason why the transport that handles the incoming
>>>>>>>> >> request should determine the threading behavior of the transport
>>>>>>>> >> that sends the outgoing request to the target service. Maybe
>>>>>>>> >> Asankha can comment on this?
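The constant-memory behaviour described above ultimately rests on copying with a fixed-size buffer instead of materialising the whole payload as a String (which is what the PlainTextBuilder stack trace further down shows going wrong). A minimal, Synapse-independent illustration of the idea; the class name is invented for the example:

```java
import java.io.IOException;
import java.io.Reader;
import java.io.Writer;

// Copy character data with a fixed-size buffer: working memory is
// O(buffer size), not O(file size), no matter how large the input is.
public class StreamCopy {

    public static long copy(Reader in, Writer out) throws IOException {
        char[] buffer = new char[8192]; // fixed working memory
        long total = 0;
        int n;
        while ((n = in.read(buffer)) != -1) {
            out.write(buffer, 0, n);
            total += n;
        }
        out.flush();
        return total;
    }
}
```

By contrast, IOUtils.toString accumulates the entire content in a StringBuffer first, which is exactly what blows the heap on a 2GB file.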
>>>>>>>> >>
>>>>>>>> >> Andreas
>>>>>>>> >>
>>>>>>>> >> On Thu, Mar 5, 2009 at 07:21, kimhorn <[email protected]> wrote:
>>>>>>>> >> >
>>>>>>>> >> > That's good, as this stops us using Synapse.
>>>>>>>> >> >
>>>>>>>> >> > Asankha C. Perera wrote:
>>>>>>>> >> >>
>>>>>>>> >> >>> Exception in thread "vfs-Worker-4" java.lang.OutOfMemoryError: Java heap space
>>>>>>>> >> >>>     at java.lang.AbstractStringBuilder.expandCapacity(AbstractStringBuilder.java:99)
>>>>>>>> >> >>>     at java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:518)
>>>>>>>> >> >>>     at java.lang.StringBuffer.append(StringBuffer.java:307)
>>>>>>>> >> >>>     at java.io.StringWriter.write(StringWriter.java:72)
>>>>>>>> >> >>>     at org.apache.commons.io.IOUtils.copyLarge(IOUtils.java:1129)
>>>>>>>> >> >>>     at org.apache.commons.io.IOUtils.copy(IOUtils.java:1104)
>>>>>>>> >> >>>     at org.apache.commons.io.IOUtils.copy(IOUtils.java:1078)
>>>>>>>> >> >>>     at org.apache.commons.io.IOUtils.toString(IOUtils.java:382)
>>>>>>>> >> >>>     at org.apache.synapse.format.PlainTextBuilder.processDocument(PlainTextBuilder.java:68)
>>>>>>>> >> >>>
>>>>>>>> >> >> Since the content type is text, the plain text builder is trying
>>>>>>>> >> >> to read the whole content into a String, as I see, which is a
>>>>>>>> >> >> problem for large content.
>>>>>>>> >> >>
>>>>>>>> >> >> A definite bug we need to fix.
>>>>>>>> >> >>
>>>>>>>> >> >> cheers
>>>>>>>> >> >> asankha
>>>>>>>> >> >>
>>>>>>>> >> >> --
>>>>>>>> >> >> Asankha C.
Perera
>>>>>>>> >> >> AdroitLogic, http://adroitlogic.org
>>>>>>>> >> >> http://esbmagic.blogspot.com
>>>>>>>> >> >>
>>>>>>>> >> >> ---------------------------------------------------------------------
>>>>>>>> >> >> To unsubscribe, e-mail: [email protected]
>>>>>>>> >> >> For additional commands, e-mail: [email protected]
>>>>>>>> >> >>
>>>>>>>> >> >
>>>>>>>> >> > --
>>>>>>>> >> > View this message in context:
>>>>>>>> >> > http://www.nabble.com/VFS---Synapse-Memory-Leak-tp22344176p22345904.html
>>>>>>>> >> > Sent from the Synapse - Dev mailing list archive at Nabble.com.
>>>>>>>> >
>>>>>>>> > --
>>>>>>>> > Ruwan Linton
>>>>>>>> > http://wso2.org - "Oxygenating the Web Services Platform"
>>>>>>>> > http://ruwansblog.blogspot.com/
>>>>>>>
>>>>>>> --
>>>>>>> Ruwan Linton
>>>>>>> http://wso2.org - "Oxygenating the Web Services Platform"
>>>>>>> http://ruwansblog.blogspot.com/
>>>>>>
>>>>>> --
>>>>>> View this message in context:
>>>>>> http://www.nabble.com/VFS---Synapse-Memory-Leak-tp22344176p22405973.html
>>>>>> Sent from the Synapse - Dev mailing list archive at Nabble.com.
>>>
>>> --
>>> View this message in context:
>>> http://www.nabble.com/VFS---Synapse-Memory-Leak-tp22344176p22594321.html
>>> Sent from the Synapse - Dev mailing list archive at Nabble.com.
