[magnolia-dev] OOME during large activation: how to get chunked transfer coding?

Jörg von Frantzius Mon, 15 Nov 2010 08:59:13 -0800

Hi,

we received a heap dump from our client in which a single thread holds about 260 MB of memory while trying to activate some seemingly large content:

  at java.io.FileInputStream.readBytes([BII)I (Native Method)
  at java.io.FileInputStream.read([B)I (FileInputStream.java:177)
  at org.apache.commons.io.IOUtils.copyLarge(Ljava/io/InputStream;Ljava/io/OutputStream;)J (IOUtils.java:1025)
  at org.apache.commons.io.IOUtils.copy(Ljava/io/InputStream;Ljava/io/OutputStream;)I (IOUtils.java:999)
  at info.magnolia.module.exchangesimple.Transporter.transport(Ljava/net/HttpURLConnection;Linfo/magnolia/module/exchangesimple/ActivationContent;)V (Transporter.java:134)
  at info.magnolia.module.exchangesimple.SimpleSyndicator.activate(Linfo/magnolia/cms/exchange/Subscriber;Linfo/magnolia/module/exchangesimple/ActivationContent;)Ljava/lang/String; (SimpleSyndicator.java:173)
  at info.magnolia.module.exchangesimple.SimpleSyndicator$2.run()V (SimpleSyndicator.java:120)
  at EDU.oswego.cs.dl.util.concurrent.PooledExecutor$Worker.run()V (Unknown Source)
  at java.lang.Thread.run()V (Thread.java:662)

Accumulated Objects

Class Name                                             | Shallow Heap | Retained Heap | Percentage
java.lang.Thread @ 0x7fef5c1cb820 Thread-8694          |          176 |   268.476.352 |     33,87%
  sun.net.www.http.PosterOutputStream @ 0x7fef64353bc8 |           40 |   268.435.520 |     33,87%
    byte[268435456] @ 0x7fef78870000                   |  268.435.480 |   268.435.480 |     33,87%

Content of the byte array (truncated in the dump):
--mgnlExchange-cfc93688d385..content-disposition: form-data; name="exchange_3d9b7a32-16b6-493b-a80c-77374632719f1252654589289893304.xml.gz"; filename="exchange_3d9b7a32-16b6-493b-a80c-77374632719f1252654589289893304.xml.gz"..content-type: application/octet...

It seems that PosterOutputStream, being a ByteArrayOutputStream by inheritance, will buffer the whole activation request in memory. In addition, by looking at the Magnolia code, it seems that for a single activation, this will happen once for every subscriber. So that's going to put a lot of load on the GC when a single large binary is activated, or even worse, when several users are simultaneously activating large binaries.

There has been a similar discussion here earlier: http://www.mail-archive.com/user-l...@magnolia-cms.com/msg01809.html, where Jan was wondering why a ByteArrayOutputStream was used by java.net.HttpURLConnection, resulting in the OOME for large activations. The problem could be avoided if "chunked transfer coding" were used for the activation requests from author to public servers.

I think it would be really great if chunking were used reliably, since currently system stability is at risk whenever large binaries are activated. So I started digging a bit.

There is a method java.net.HttpURLConnection.setChunkedStreamingMode(int) to enable chunking, and to me it seems that this needs explicit invocation in order to get chunking to happen, i.e. I don't think that chunking can be enabled by means of some configuration. The method's javadoc says that the request could fail if the server does not support chunking. RFC 2616, on the other hand, says that 'All HTTP/1.1 applications MUST be able to receive and decode the "chunked" transfer-coding'.
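A minimal sketch of how the explicit invocation could look, assuming a plain POST to the subscriber URL. The class name ChunkedUpload, its method names, and the chunk size of 8192 are hypothetical and not taken from the Magnolia code; only the HttpURLConnection calls are real JDK API. The point is that setChunkedStreamingMode(int) has to be called before the output stream is opened, after which the body is written through in chunks instead of being collected in a PosterOutputStream.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;

// Hypothetical helper, not part of Magnolia.
final class ChunkedUpload {

    // Assumed chunk size; a value <= 0 makes the JDK pick its default.
    static final int CHUNK_SIZE = 8192;

    // Prepares a POST connection that streams its body in chunks.
    // No network traffic happens here; the connection is only configured.
    static HttpURLConnection openChunked(URL target) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) target.openConnection();
        conn.setDoOutput(true);
        conn.setRequestMethod("POST");
        // Must be set before connecting, i.e. before getOutputStream().
        conn.setChunkedStreamingMode(CHUNK_SIZE);
        return conn;
    }

    // Copies the stream with a small fixed buffer and returns the byte count.
    static long copy(InputStream in, OutputStream out) throws IOException {
        byte[] buf = new byte[CHUNK_SIZE];
        long total = 0;
        int n;
        while ((n = in.read(buf)) != -1) {
            out.write(buf, 0, n);
            total += n;
        }
        return total;
    }

    // Sends the body and returns the HTTP status code of the response.
    static int send(URL target, InputStream body) throws IOException {
        HttpURLConnection conn = openChunked(target);
        try (OutputStream out = conn.getOutputStream()) {
            copy(body, out);
        }
        return conn.getResponseCode();
    }
}
```

With this in place the memory needed per activation request would be bounded by the copy buffer and the chunk size rather than by the size of the activated content.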

So if the Magnolia code always invoked HttpURLConnection.setChunkedStreamingMode(int), this would only add the requirement that an HTTP/1.1 capable server be used for the public instances. I'd think that this shouldn't be a big problem? Alternatively, there could be a fallback to non-chunked mode in case of failure.
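A rough sketch of such a fallback, under several assumptions that are not from the Magnolia code: the payload is a single file of known length, a server that refuses chunking answers with 411 (Length Required) or 501 (Not Implemented), and the retry uses setFixedLengthStreamingMode(long) rather than the default buffering, so that the retry does not hold the whole body in memory either. The class ChunkingFallback and its methods are hypothetical names.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;

// Hypothetical helper, not part of Magnolia.
final class ChunkingFallback {

    // Assumption: these two status codes indicate that the chunked body was refused.
    static boolean rejectedChunking(int status) {
        return status == HttpURLConnection.HTTP_LENGTH_REQUIRED   // 411
            || status == HttpURLConnection.HTTP_NOT_IMPLEMENTED;  // 501
    }

    // Posts the file either chunked or with a declared Content-Length.
    static int post(URL target, File payload, boolean chunked) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) target.openConnection();
        conn.setDoOutput(true);
        conn.setRequestMethod("POST");
        if (chunked) {
            conn.setChunkedStreamingMode(0); // 0 selects the JDK default chunk size
        } else {
            conn.setFixedLengthStreamingMode(payload.length());
        }
        try (InputStream in = new FileInputStream(payload);
             OutputStream out = conn.getOutputStream()) {
            byte[] buf = new byte[8192];
            int n;
            while ((n = in.read(buf)) != -1) {
                out.write(buf, 0, n);
            }
        }
        return conn.getResponseCode();
    }

    // Tries chunked first and repeats the request non-chunked if it was refused.
    static int postWithFallback(URL target, File payload) throws IOException {
        int status = post(target, payload, true);
        return rejectedChunking(status) ? post(target, payload, false) : status;
    }
}
```

This only covers the case where the server answers with a status code. A server that simply drops the connection would surface as an IOException, which a real implementation would have to decide how to handle.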

What do you think of this improvement?

Regards,
Jörg

--
Dipl. inf. Jörg von Frantzius, System Architect
Email mailto:joerg.frantz...@aperto.de
Phone +49 30 283921-318
Fax +49 30 283921-29
Aperto AG - In der Pianofabrik
Chausseestraße 5, D-10115 Berlin-Mitte
Web http://www.aperto.de
HRB 77049, AG Berlin Charlottenburg
Vorstand: Dirk Buddensiek

----------------------------------------------------------------
For list details see
http://www.magnolia-cms.com/home/community/mailing-lists.html
To unsubscribe, E-mail to: <dev-list-unsubscr...@magnolia-cms.com>
----------------------------------------------------------------
