Using the last stable build from 15 March 2009 I still get exactly the same behaviour as originally described with the above script. VFS still just dies. Would your fixes be in this build?
Andreas Veithen-2 wrote:
>
> I committed the code and it will be available in the next WS-Commons
> transport build. The methods are located in
> org.apache.axis2.format.ElementHelper in the axis2-transport-base
> module.
>
> Andreas
>
> On Thu, Mar 12, 2009 at 00:06, Kim Horn <[email protected]> wrote:
>> Hello Andreas,
>> This is great and really helps; I have not had time to try it out but will
>> soon.
>>
>> Contributing the java.io.Reader would be a great help, but it will take me
>> a while to get up to speed to do the Synapse iterator.
>>
>> In the short term I am going to use a brute-force approach that is now
>> feasible given the memory issue is resolved. Just thought of this one
>> today. Use a VFS proxy to FTP the file locally; so streaming helps here. A
>> POJOCommand on <out> to split the file into another directory, stream in and
>> out. Another independent VFS proxy watches that directory and submits
>> each file to the Web service. Hopefully memory will be fine. Overloading the
>> destination may still be an issue?
>>
>> Kim
>>
>> -----Original Message-----
>> From: Andreas Veithen [mailto:[email protected]]
>> Sent: Monday, 9 March 2009 10:55 PM
>> To: [email protected]
>> Subject: Re: VFS - Synapse Memory Leak
>>
>> The changes I did in the VFS transport and the message builders for
>> text/plain and application/octet-stream certainly don't provide an
>> out-of-the-box solution for your use case, but they are the
>> prerequisite.
>>
>> Concerning your first proposed solution (let the VFS transport write the
>> content to a temporary file), I don't like this because it would create a
>> tight coupling between the VFS transport and the mediator. A design
>> goal should be that the solution will still work if the file comes
>> from another source, e.g. an attachment in an MTOM or SwA message.
>>
>> I think that an all-Synapse solution (2 or 3) should be possible, but
>> this will require development of a custom mediator.
This mediator
>> would read the content, split it up (and store the chunks in memory or
>> on disk) and execute a sub-sequence for each chunk. The execution of
>> the sub-sequence would happen synchronously to limit the memory/disk
>> space consumption (to the maximum chunk size) and to avoid flooding
>> the destination service.
>>
>> Note that it is probably not possible to implement the mediator
>> using a script because of the problematic String handling. Also,
>> Spring, POJO and class mediators don't support sub-sequences (I
>> think). Therefore it should be implemented as a full-featured Java
>> mediator, probably taking the existing iterate mediator as a template.
>> I can contribute the required code to get the text content in the form
>> of a java.io.Reader.
>>
>> Regards,
>>
>> Andreas
>>
>> On Mon, Mar 9, 2009 at 03:05, kimhorn <[email protected]> wrote:
>>>
>>> Although this is a good feature it may not solve the actual problem.
>>> The first issue on my list was the memory leak.
>>> However, the real problem is that once I get these massive files I have to
>>> send them to a web Service that can only take them in small chunks (about
>>> 14 MB). Streaming them straight out would just kill the destination Web
>>> service: it would get the memory error. The text document can be split
>>> apart easily, as it has independent records on each line separated by
>>> <CR> <LF>.
>>>
>>> In an earlier post, which was not responded to, I mentioned:
>>>
>>> "Otherwise; for large EDI files a VFS iterator Mediator that streams
>>> through the input file and outputs smaller chunks for processing, in
>>> Synapse, may be a solution?"
>>>
>>> So I had mentioned a few solutions in prior posts; the solutions now are:
>>>
>>> 1) VFS writes straight to a temporary file, then a Java mediator can
>>> process the file by splitting it into many smaller files. These files then
>>> trigger another VFS proxy that submits these to the final web Service.
>>> The problem is that it uses the file system (not so bad).
>>> 2) A Java Mediator takes the <text> package and splits it up by wrapping it
>>> into many XML <data> elements that can then be acted on by a Synapse
>>> Iterator. So replace the text message with many smaller XML elements.
>>> The problem is that this loads the whole message into memory.
>>> 3) Create another Iterator in Synapse that works on a Regular expression
>>> (to split the text data) or actually uses a for-loop approach to chop the
>>> file into chunks based on the loop index value. E.g. Index = 23 means a 14K
>>> chunk 23 chunks into the data.
>>> 4) Using the approach proposed now - just submit the file straight (stream
>>> it) to another web service that chops it up. It may return an XML document
>>> with many sub-elements that allows the standard Iterator to work. Similar
>>> to (2) but using another service rather than Java to split the document.
>>> 5) Using the approach proposed now - just submit the file straight (stream
>>> it) to another web service that chops it up but calls a Synapse proxy with
>>> each small packet of data, which then forwards it to the final Web Service.
>>> So the Web Service iterates across the data, and not Synapse.
>>>
>>> Other solutions replace Synapse with a stand-alone Java program at the
>>> front end.
>>>
>>> Another issue here is throttling: splitting the file is one issue, but
>>> submitting hundreds of calls in parallel to the destination service would
>>> result in timeouts... So we need to work in throttling.
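For what it's worth, the splitting step common to solutions 1-3 above could be sketched as follows. This is purely illustrative: the class and method names are mine and not part of Synapse or Axis2. It reads CR/LF-delimited records from a `Reader` (e.g. the one contributed via `ElementHelper`, or a locally FTP'd file) and writes them into chunk files below a size cap, which a second VFS proxy could then pick up one at a time (giving crude throttling for free, since the chunks are submitted sequentially):

```java
import java.io.*;
import java.nio.file.*;

// Illustrative sketch only: splits a large record-per-line text file into
// chunk files of at most maxChunkBytes each, never holding more than one
// line in memory. A second VFS proxy could watch outDir and submit each
// chunk to the destination service.
public class RecordFileSplitter {

    /** Returns the number of chunk files written to outDir. */
    public static int split(BufferedReader in, long maxChunkBytes, Path outDir)
            throws IOException {
        Files.createDirectories(outDir);
        int chunkIndex = 0;
        long written = 0;
        Writer out = null;
        String line;
        while ((line = in.readLine()) != null) {
            byte[] record = (line + "\r\n").getBytes("UTF-8");
            // Start a new chunk file when the current one would overflow.
            if (out == null || written + record.length > maxChunkBytes) {
                if (out != null) out.close();
                chunkIndex++;
                out = new OutputStreamWriter(new FileOutputStream(
                        outDir.resolve("chunk-" + chunkIndex + ".txt").toFile()),
                        "UTF-8");
                written = 0;
            }
            out.write(line);
            out.write("\r\n");
            written += record.length;
        }
        if (out != null) out.close();
        return chunkIndex;
    }
}
```

Because the input is consumed line by line, memory stays bounded by the longest record rather than the file size, which is the property the thread is after.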
>>>
>>> Ruwan Linton wrote:
>>>>
>>>> I agree and can understand the time factor, and also +1 for reusing stuff
>>>> rather than trying to reinvent the wheel :-)
>>>>
>>>> Thanks,
>>>> Ruwan
>>>>
>>>> On Sun, Mar 8, 2009 at 4:08 PM, Andreas Veithen <[email protected]> wrote:
>>>>
>>>>> Ruwan,
>>>>>
>>>>> It's not a question of possibility, it is a question of available time :-)
>>>>>
>>>>> Also note that some of the features that we might want to implement
>>>>> have some similarities with what is done for attachments in Axiom
>>>>> (except that an attachment is only available once, while a file over
>>>>> VFS can be read several times). I think there is also some existing
>>>>> code in Axis2 that might be useful. We should not reimplement these
>>>>> things but try to make the existing code reusable. This however is
>>>>> only realistic for the next release after 1.3.
>>>>>
>>>>> Andreas
>>>>>
>>>>> On Sun, Mar 8, 2009 at 03:47, Ruwan Linton <[email protected]> wrote:
>>>>> > Andreas,
>>>>> >
>>>>> > Can we have the caching at the file system as a property to support the
>>>>> > multiple layers touching the full message, and is it possible to make it
>>>>> > specify a threshold for streaming? For example, if the message is
>>>>> > touched several times we might still need streaming, but not for 100 KB
>>>>> > or smaller files.
>>>>> >
>>>>> > Thanks,
>>>>> > Ruwan
>>>>> >
>>>>> > On Sun, Mar 8, 2009 at 1:12 AM, Andreas Veithen <[email protected]> wrote:
>>>>> >>
>>>>> >> I've done an initial implementation of this feature. It is available
>>>>> >> in trunk and should be included in the next nightly build.
In order to
>>>>> >> enable this in your configuration, you need to add the following
>>>>> >> property to the proxy:
>>>>> >>
>>>>> >> <parameter name="transport.vfs.Streaming">true</parameter>
>>>>> >>
>>>>> >> You also need to add the following mediators just before the <send>
>>>>> >> mediator:
>>>>> >>
>>>>> >> <property action="remove" name="transportNonBlocking" scope="axis2"/>
>>>>> >> <property action="set" name="OUT_ONLY" value="true"/>
>>>>> >>
>>>>> >> With this configuration Synapse will stream the data directly from the
>>>>> >> incoming to the outgoing transport without storing it in memory or in
>>>>> >> a temporary file. Note that this has two other side effects:
>>>>> >> * The incoming file (or connection in the case of a remote file) will
>>>>> >> only be opened on demand. In this case this happens during execution
>>>>> >> of the <send> mediator.
>>>>> >> * If during the mediation the content of the file is needed several
>>>>> >> times (which is not the case in your example), it will be read several
>>>>> >> times. The reason is of course that the content is not cached.
>>>>> >>
>>>>> >> I tested the solution with a 2GB file and it worked fine. The
>>>>> >> performance of the implementation is not yet optimal, but at least the
>>>>> >> memory consumption is constant.
>>>>> >>
>>>>> >> Some additional comments:
>>>>> >> * The transport.vfs.Streaming property has no impact on XML and SOAP
>>>>> >> processing: this type of content is processed exactly as before.
>>>>> >> * With the changes described here, we now have two different policies
>>>>> >> for plain text and binary content processing: in-memory caching + no
>>>>> >> streaming (transport.vfs.Streaming=false) and no caching + deferred
>>>>> >> connection + streaming (transport.vfs.Streaming=true). Probably we
>>>>> >> should define a wider range of policies in the future, including file
>>>>> >> system caching + streaming.
>>>>> >> * It is necessary to remove the transportNonBlocking property
>>>>> >> (MessageContext.TRANSPORT_NON_BLOCKING) to prevent the <send> mediator
>>>>> >> (more precisely the OperationClient) from executing the outgoing
>>>>> >> transport in a separate thread. This property is set by the incoming
>>>>> >> transport. I think this is a bug since I don't see any valid reason
>>>>> >> why the transport that handles the incoming request should determine
>>>>> >> the threading behavior of the transport that sends the outgoing
>>>>> >> request to the target service. Maybe Asankha can comment on this?
>>>>> >>
>>>>> >> Andreas
>>>>> >>
>>>>> >> On Thu, Mar 5, 2009 at 07:21, kimhorn <[email protected]> wrote:
>>>>> >> >
>>>>> >> > That's good, as this stops us using Synapse.
>>>>> >> >
>>>>> >> > Asankha C. Perera wrote:
>>>>> >> >>
>>>>> >> >>> Exception in thread "vfs-Worker-4" java.lang.OutOfMemoryError:
>>>>> >> >>> Java heap space
>>>>> >> >>> at java.lang.AbstractStringBuilder.expandCapacity(AbstractStringBuilder.java:99)
>>>>> >> >>> at java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:518)
>>>>> >> >>> at java.lang.StringBuffer.append(StringBuffer.java:307)
>>>>> >> >>> at java.io.StringWriter.write(StringWriter.java:72)
>>>>> >> >>> at org.apache.commons.io.IOUtils.copyLarge(IOUtils.java:1129)
>>>>> >> >>> at org.apache.commons.io.IOUtils.copy(IOUtils.java:1104)
>>>>> >> >>> at org.apache.commons.io.IOUtils.copy(IOUtils.java:1078)
>>>>> >> >>> at org.apache.commons.io.IOUtils.toString(IOUtils.java:382)
>>>>> >> >>> at org.apache.synapse.format.PlainTextBuilder.processDocument(PlainTextBuilder.java:68)
>>>>> >> >>
>>>>> >> >> Since the content type is text, the plain text formatter is trying
>>>>> >> >> to use a String to parse, as I see..
which is a problem for large content..
>>>>> >> >>
>>>>> >> >> A definite bug we need to fix..
>>>>> >> >>
>>>>> >> >> cheers
>>>>> >> >> asankha
>>>>> >> >>
>>>>> >> >> --
>>>>> >> >> Asankha C. Perera
>>>>> >> >> AdroitLogic, http://adroitlogic.org
>>>>> >> >> http://esbmagic.blogspot.com
>>>>> >> >>
>>>>> >> >> ---------------------------------------------------------------------
>>>>> >> >> To unsubscribe, e-mail: [email protected]
>>>>> >> >> For additional commands, e-mail: [email protected]
>>>>> >> >
>>>>> >> > --
>>>>> >> > View this message in context:
>>>>> >> > http://www.nabble.com/VFS---Synapse-Memory-Leak-tp22344176p22345904.html
>>>>> >> > Sent from the Synapse - Dev mailing list archive at Nabble.com.
>>>>> >
>>>>> > --
>>>>> > Ruwan Linton
>>>>> > http://wso2.org - "Oxygenating the Web Services Platform"
>>>>> > http://ruwansblog.blogspot.com/
>>>>
>>>> --
>>>> Ruwan Linton
>>>> http://wso2.org - "Oxygenating the Web Services Platform"
>>>> http://ruwansblog.blogspot.com/
>>>
>>> --
>>> View this message in context:
>>> http://www.nabble.com/VFS---Synapse-Memory-Leak-tp22344176p22405973.html
>>> Sent from the Synapse - Dev mailing list archive at
Nabble.com.

--
View this message in context:
http://www.nabble.com/VFS---Synapse-Memory-Leak-tp22344176p22594321.html
Sent from the Synapse - Dev mailing list archive at Nabble.com.
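Pulling together the settings Andreas describes in the thread above, a streaming VFS proxy might look roughly like the following. This is a sketch, not a tested configuration: the proxy name, file URI, and target endpoint are placeholders, and transport.vfs.ContentType is assumed to be text/plain as in the original scenario.

```xml
<proxy name="LargeFileProxy" transports="vfs">
  <!-- Placeholder polling location; adjust to the real input directory -->
  <parameter name="transport.vfs.FileURI">file:///var/spool/synapse/in</parameter>
  <parameter name="transport.vfs.ContentType">text/plain</parameter>
  <!-- Enable pass-through streaming of plain text / binary content -->
  <parameter name="transport.vfs.Streaming">true</parameter>
  <target>
    <inSequence>
      <!-- Remove transportNonBlocking so the outgoing transport runs in the
           same thread, as explained by Andreas above -->
      <property action="remove" name="transportNonBlocking" scope="axis2"/>
      <property action="set" name="OUT_ONLY" value="true"/>
      <send>
        <endpoint>
          <!-- Placeholder destination service -->
          <address uri="http://example.org/TargetService"/>
        </endpoint>
      </send>
    </inSequence>
  </target>
</proxy>
```

With this in place the file content is streamed from the VFS transport to the outgoing transport during `<send>`, with constant memory use, as reported in the thread.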
