Jan, didn't we look into chunking at some point in the past and run into problems with it? Or was that unrelated to activation? Otherwise, I guess the improvement is fine, as long as it can be turned off in case someone runs activation through a proxy or some odd HTTP server.
-g

On Nov 15, 2010, at 17:57, Jörg von Frantzius wrote:

> Hi,
>
> we received a heap dump from our client, where there is a thread holding
> about 260 MB of memory while trying to activate some seemingly large
> content:
>
> at java.io.FileInputStream.readBytes([BII)I (Native Method)
> at java.io.FileInputStream.read([B)I (FileInputStream.java:177)
> at org.apache.commons.io.IOUtils.copyLarge(Ljava/io/InputStream;Ljava/io/OutputStream;)J (IOUtils.java:1025)
> at org.apache.commons.io.IOUtils.copy(Ljava/io/InputStream;Ljava/io/OutputStream;)I (IOUtils.java:999)
> at info.magnolia.module.exchangesimple.Transporter.transport(Ljava/net/HttpURLConnection;Linfo/magnolia/module/exchangesimple/ActivationContent;)V (Transporter.java:134)
> at info.magnolia.module.exchangesimple.SimpleSyndicator.activate(Linfo/magnolia/cms/exchange/Subscriber;Linfo/magnolia/module/exchangesimple/ActivationContent;)Ljava/lang/String; (SimpleSyndicator.java:173)
> at info.magnolia.module.exchangesimple.SimpleSyndicator$2.run()V (SimpleSyndicator.java:120)
> at EDU.oswego.cs.dl.util.concurrent.PooledExecutor$Worker.run()V (Unknown Source)
> at java.lang.Thread.run()V (Thread.java:662)
>
> Accumulated Objects
>
> Class Name                                              Shallow Heap  Retained Heap  Percentage
> java.lang.Thread @ 0x7fef5c1cb820 Thread-8694                    176    268.476.352      33,87%
>   sun.net.www.http.PosterOutputStream @ 0x7fef64353bc8            40    268.435.520      33,87%
>     byte[268435456] @ 0x7fef78870000                      268.435.480    268.435.480      33,87%
>
> Content of the byte[] (truncated by the heap analyzer):
> --mgnlExchange-cfc93688d385..content-disposition: form-data;
> name="exchange_3d9b7a32-16b6-493b-a80c-77374632719f1252654589289893304.xml.gz";
> filename="exchange_3d9b7a32-16b6-493b-a80c-77374632719f1252654589289893304.xml.gz"..content-type:
> application/octet...
>
> It seems that PosterOutputStream, being a ByteArrayOutputStream by
> inheritance, will buffer the whole activation request in memory.
> In addition, by looking at the Magnolia code, it seems that for a single
> activation this will happen once for every subscriber. So that's going to
> put a lot of load on the GC when a single large binary is activated, or
> even worse, when several users are activating large binaries
> simultaneously.
>
> There has been a similar discussion here earlier:
> http://www.mail-archive.com/user-l...@magnolia-cms.com/msg01809.html, where
> Jan was wondering why a ByteArrayOutputStream was used by
> java.net.Connection, resulting in the OOME for large activations. The
> problem could be avoided if "chunked transfer coding" was used during the
> activation requests from author to public servers.
>
> I think it would be really great if chunking was used reliably, since
> currently large binaries put system stability at risk. So I started
> digging a bit.
>
> There is a method java.net.HttpURLConnection.setChunkedStreamingMode(int)
> to enable chunking, and to me it seems that it has to be invoked
> explicitly for chunking to happen, i.e. I don't think that chunking can be
> enabled by means of some configuration. The method's javadoc says that the
> request could fail if the server does not support chunking. RFC 2616, on
> the other hand, says that 'All HTTP/1.1 applications MUST be able to
> receive and decode the "chunked" transfer-coding'.
>
> So if the Magnolia code always invoked
> HttpURLConnection.setChunkedStreamingMode(int), this would only add the
> requirement that an HTTP/1.1-capable server be used for the public
> instances. I'd think that this shouldn't be a big problem? Alternatively,
> there could be a fallback to non-chunked mode in case of failure.
>
> What do you think of this improvement?
>
> Regards,
> Jörg
>
> --
> Dipl. inf. Jörg von Frantzius, System Architect
> Email mailto:joerg.frantz...@aperto.de
> Phone +49 30 283921-318
> Fax +49 30 283921-29
> Aperto AG - In der Pianofabrik
> Chausseestraße 5, D-10115 Berlin-Mitte
> Web http://www.aperto.de
> HRB 77049, AG Berlin Charlottenburg
> Vorstand: Dirk Buddensiek
>
> ----------------------------------------------------------------
> For list details see
> http://www.magnolia-cms.com/home/community/mailing-lists.html
> To unsubscribe, E-mail to: <dev-list-unsubscr...@magnolia-cms.com>
> ----------------------------------------------------------------
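
For reference, here is a minimal sketch of what the proposal in this thread could look like: chunked streaming on an HttpURLConnection, with a switch to turn it off for proxies or servers that cannot handle it. This is not the actual Transporter code. The class name ChunkedUploadSketch, the upload method, the "chunked" flag and the CHUNK_SIZE constant are all made up for illustration, and the multipart framing that the real activation request uses is left out.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;

// Hypothetical sketch, not Magnolia code: streams a request body to a URL,
// optionally using HTTP/1.1 chunked transfer coding.
public class ChunkedUploadSketch {

    // Assumed chunk and copy buffer size; any positive value would do.
    static final int CHUNK_SIZE = 8192;

    /**
     * POSTs the given stream to the target and returns the HTTP status code.
     * The "chunked" flag is the hypothetical off switch discussed in the thread.
     */
    public static int upload(URL target, InputStream body, boolean chunked)
            throws IOException {
        HttpURLConnection conn = (HttpURLConnection) target.openConnection();
        conn.setDoOutput(true);
        conn.setRequestMethod("POST");
        conn.setRequestProperty("Content-Type", "application/octet-stream");
        if (chunked) {
            // Has to be called before the connection is opened. With it, the
            // body is sent as it is written instead of being buffered in
            // memory first so that a Content-Length can be computed.
            conn.setChunkedStreamingMode(CHUNK_SIZE);
        }
        try (OutputStream out = conn.getOutputStream()) {
            byte[] buffer = new byte[CHUNK_SIZE];
            int n;
            while ((n = body.read(buffer)) != -1) {
                out.write(buffer, 0, n);
            }
        }
        try {
            return conn.getResponseCode();
        } finally {
            conn.disconnect();
        }
    }
}
```

With chunked set to false this behaves like the current buffering approach, so the flag could be wired to whatever configuration property ends up being chosen. Two caveats from the HttpURLConnection javadoc: once streaming is enabled, authentication and redirects are no longer handled automatically (an HttpRetryException is thrown instead), and if the content length is known up front, setFixedLengthStreamingMode avoids the buffering without needing chunked support on the receiving side.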