[magnolia-dev] [JIRA] Updated: (MAGNOLIA-3390) Prevent OOME and GC load during activation of large data sets
[ http://jira.magnolia-cms.com/browse/MAGNOLIA-3390?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jörg von Frantzius updated MAGNOLIA-3390:
-----------------------------------------

    Summary: Prevent OOME and GC load during activation of large data sets  (was: Prevent OOME and GC load during activation of large binaries)

> Prevent OOME and GC load during activation of large data sets
> --------------------------------------------------------------
>
>                 Key: MAGNOLIA-3390
>                 URL: http://jira.magnolia-cms.com/browse/MAGNOLIA-3390
>             Project: Magnolia
>          Issue Type: Improvement
>          Components: activation
>    Affects Versions: 4.3.8, 4.4.x, 5.0
>            Reporter: Jörg von Frantzius
>            Assignee: Philipp Bärfuss
>
>
> During activation, the activated data is currently held completely in memory
> for each activation request sent from author to public.
>
> h4. The problem
> When, for example, 250 MB are activated on an author instance with 4
> subscribers, these 250 MB are allocated 4 times in a row in RAM and
> garbage-collected afterwards. Even if no OutOfMemoryError occurs, this puts
> a high load on the garbage collector and is likely to force the VM into
> "stop the world" full collections, which makes the author instance
> unresponsive for editors.
> Given large enough binary data, or simultaneous attempts at activating it,
> any maximum heap size can be exceeded.
>
> h4. Current implementation
> This seems to be due to the default behaviour of
> {{java.net.URLConnection.getOutputStream()}} used by
> {{info.magnolia.module.exchangesimple.Transporter}}, which returns a subclass
> of {{ByteArrayOutputStream}} that buffers the whole request body in memory.
> This probably happens in order to determine the Content-Length before
> actually sending the request.
>
> h4. Proposed solution
> The solution is to use "chunked transfer coding", as defined in [RFC
> 2616|http://www.w3.org/Protocols/rfc2616/rfc2616-sec3.html#sec3.6.1]. This
> needs to be explicitly enabled by calling
> {{java.net.HttpURLConnection.setChunkedStreamingMode(int)}} prior to
> {{getOutputStream()}}. I verified via debugger that doing so results in a
> {{sun.net.www.protocol.http.HttpURLConnection$StreamingOutputStream extends
> FilterOutputStream}} instead of a {{sun.net.www.http.PosterOutputStream
> extends ByteArrayOutputStream}}.
>
> Chunking requires the public server to be HTTP/1.1 compliant. In case
> HTTP/1.1 compliance poses a problem, e.g. with proxied public servers or
> non-standard HTTP servers, chunking of activation requests should be
> configurable. There could, for example, be a configuration NodeData
> "server/activation/subscribers//useRequestChunking" with
> default value "true".

--
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: http://jira.magnolia-cms.com/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira

For list details see http://www.magnolia-cms.com/home/community/mailing-lists.html
To unsubscribe, E-mail to:
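
The proposed solution in the issue description can be illustrated with a small, self-contained sketch. It is not Magnolia's Transporter code: the class name ChunkedActivationSketch, the methods send and roundTrip, the 8 KiB chunk size and the throwaway local test server are all assumptions made for illustration. The boolean parameter only mirrors the suggested "useRequestChunking" setting; how Magnolia would actually read that setting is not shown.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.InetSocketAddress;
import java.net.URL;
import java.util.concurrent.atomic.AtomicLong;

// Illustrative sketch only; names and sizes are hypothetical, not Magnolia's.
class ChunkedActivationSketch {

    /** Streams totalBytes of dummy payload to target and returns the HTTP status. */
    static int send(URL target, long totalBytes, boolean useRequestChunking) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) target.openConnection();
        conn.setDoOutput(true);
        conn.setRequestMethod("POST");
        if (useRequestChunking) {
            // Must be called before getOutputStream(); the chunk size is arbitrary here.
            conn.setChunkedStreamingMode(8192);
        }
        byte[] buffer = new byte[8192];
        try (OutputStream out = conn.getOutputStream()) {
            for (long remaining = totalBytes; remaining > 0; ) {
                int n = (int) Math.min(buffer.length, remaining);
                out.write(buffer, 0, n);
                remaining -= n;
            }
        }
        try {
            return conn.getResponseCode();
        } finally {
            conn.disconnect();
        }
    }

    /** Starts a throwaway local server, sends the payload, returns the bytes the server read. */
    static long roundTrip(long totalBytes, boolean useRequestChunking) throws IOException {
        AtomicLong received = new AtomicLong();
        HttpServer server = HttpServer.create(new InetSocketAddress("127.0.0.1", 0), 0);
        server.createContext("/", exchange -> {
            byte[] buf = new byte[8192];
            long count = 0;
            try (InputStream in = exchange.getRequestBody()) {
                for (int n; (n = in.read(buf)) != -1; ) {
                    count += n;
                }
            }
            received.set(count);
            exchange.sendResponseHeaders(200, -1); // -1 means: no response body
            exchange.close();
        });
        server.start();
        try {
            URL url = new URL("http://127.0.0.1:" + server.getAddress().getPort() + "/");
            int status = send(url, totalBytes, useRequestChunking);
            if (status != 200) {
                throw new IOException("unexpected status " + status);
            }
            return received.get();
        } finally {
            server.stop(0);
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(roundTrip(1_000_000L, true));
    }
}
```

With useRequestChunking set to false, the same code falls back to the default output stream, which is the in-memory buffering the issue describes. Comparing heap usage of the two modes for a large payload would be one way to check the effect on a given JVM.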
[magnolia-dev] [JIRA] Updated: (MAGNOLIA-3390) Prevent OOME and GC load during activation of large data sets
[ http://jira.magnolia-cms.com/browse/MAGNOLIA-3390?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Philipp Bärfuss updated MAGNOLIA-3390:
--------------------------------------

        Fix Version/s: 4.3.x
                       4.4.x
    Affects Version/s: 4.3.x
                           (was: 4.3.8)
                           (was: 4.4.x)
                           (was: 5.0)
             Priority: Blocker  (was: Major)

Thanks for reporting and outlining this issue. We are going to change that for 4.4 and we will most likely backport it to 4.3.x.

> Prevent OOME and GC load during activation of large data sets
> --------------------------------------------------------------
>
>                 Key: MAGNOLIA-3390
>                 URL: http://jira.magnolia-cms.com/browse/MAGNOLIA-3390
>             Project: Magnolia
>          Issue Type: Improvement
>          Components: activation
>    Affects Versions: 4.3.x
>            Reporter: Jörg von Frantzius
>            Assignee: Philipp Bärfuss
>            Priority: Blocker
>             Fix For: 4.3.x, 4.4.x
[magnolia-dev] [JIRA] Updated: (MAGNOLIA-3390) Prevent OOME and GC load during activation of large data sets
[ http://jira.magnolia-cms.com/browse/MAGNOLIA-3390?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Philipp Bärfuss updated MAGNOLIA-3390:
--------------------------------------

    Fix Version/s: 4.4 Beta1
                       (was: 4.3.x)

> Prevent OOME and GC load during activation of large data sets
> --------------------------------------------------------------
>
>                 Key: MAGNOLIA-3390
>                 URL: http://jira.magnolia-cms.com/browse/MAGNOLIA-3390
>             Project: Magnolia
>          Issue Type: Improvement
>          Components: activation
>    Affects Versions: 4.3.x
>            Reporter: Jörg von Frantzius
>            Assignee: Philipp Bärfuss
>            Priority: Blocker
>             Fix For: 4.4 Beta1
[magnolia-dev] [JIRA] Updated: (MAGNOLIA-3390) Prevent OOME and GC load during activation of large data sets
[ http://jira.magnolia-cms.com/browse/MAGNOLIA-3390?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jan Haderka updated MAGNOLIA-3390:
----------------------------------

    Fix Version/s: 4.3.x

> Prevent OOME and GC load during activation of large data sets
> --------------------------------------------------------------
>
>                 Key: MAGNOLIA-3390
>                 URL: http://jira.magnolia-cms.com/browse/MAGNOLIA-3390
>             Project: Magnolia
>          Issue Type: Improvement
>          Components: activation
>    Affects Versions: 4.3.x
>            Reporter: Jörg von Frantzius
>            Assignee: Philipp Bärfuss
>            Priority: Blocker
>             Fix For: 4.3.x, 4.4