More than 2G memory required for jruby -e 'buf = IO.read("/tmp/1GB.txt"); p buf.size'
-------------------------------------------------------------------------------------

                 Key: JRUBY-3784
                 URL: http://jira.codehaus.org/browse/JRUBY-3784
             Project: JRuby
          Issue Type: Bug
    Affects Versions: JRuby 1.3.1
            Reporter: Wayne Meissner
            Assignee: Thomas E Enebo
             Fix For: JRuby 1.4


Leaving aside the wisdom (or otherwise) of trying to read 1G of data in one go, 
JRuby fails to load a 1G file even when the JVM memory cap is set to 2G.

e.g.
./bin/jruby -J-Xmx2048m -e 'buf = IO.read("/tmp/1GB.txt"); p buf.size'
Error: Your application used more memory than the safety cap of 2048m.
Specify -J-Xmx####m to increase it (#### = cap size in MB).
Specify -w for full OutOfMemoryError stack trace

Part of this is due to the way the jvm does file I/O.  When doing a 1G read 
into a heap buffer, the jvm will allocate a 1G direct ByteBuffer, do the read 
into the direct buffer, and then copy from the direct buffer to the heap buffer.

Ergo, for a 1G read, it will allocate 2G of memory (1G heap, 1G direct).

Splitting reads larger than 1M into 1M-sized chunks alleviates this.

{format}
diff --git a/src/org/jruby/util/io/ChannelStream.java b/src/org/jruby/util/io/ChannelStream.java
index 4582eca..44b8614 100644
--- a/src/org/jruby/util/io/ChannelStream.java
+++ b/src/org/jruby/util/io/ChannelStream.java
@@ -362,10 +362,22 @@ public class ChannelStream implements Stream, Finalizable {
             // Now read unbuffered directly from the file
             //
             while (buf.hasRemaining()) {
-                int n = channel.read(buf);
+                final int MAX_READ_CHUNK = 1 * 1024 * 1024;
+                //
+                // When reading into a heap buffer, the jvm allocates a temporary
+                // direct ByteBuffer of the requested size.  To avoid allocating
+                // a huge direct buffer when doing ludicrous reads (e.g. 1G or more)
+                // we split the read up into chunks of no more than 1M
+                //
+                ByteBuffer tmp = buf.duplicate();
+                if (tmp.remaining() > MAX_READ_CHUNK) {
+                    tmp.limit(tmp.position() + MAX_READ_CHUNK);
+                }
+                int n = channel.read(tmp);
                 if (n <= 0) {
                     break;
                 }
+                buf.position(tmp.position());
             }
             eof = true;
             result.length(buf.position());
{format}
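
For illustration, the chunking logic in the patch can be extracted into a standalone sketch (the class and method names here are hypothetical, not part of the JRuby source; the 1M chunk size matches the patch):

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.Channels;
import java.nio.channels.ReadableByteChannel;

public class ChunkedRead {
    static final int MAX_READ_CHUNK = 1 * 1024 * 1024; // 1M, as in the patch

    // Fill buf from the channel, never asking the channel for more than
    // MAX_READ_CHUNK bytes per call, so any temporary direct buffer the
    // JVM allocates for the transfer stays small.  Returns bytes read.
    static int readFully(ReadableByteChannel channel, ByteBuffer buf) throws IOException {
        int total = 0;
        while (buf.hasRemaining()) {
            // duplicate() shares the contents but has an independent
            // position/limit, so we can cap the read size on the copy
            ByteBuffer tmp = buf.duplicate();
            if (tmp.remaining() > MAX_READ_CHUNK) {
                tmp.limit(tmp.position() + MAX_READ_CHUNK);
            }
            int n = channel.read(tmp);
            if (n <= 0) {
                break;
            }
            total += n;
            buf.position(tmp.position()); // advance the real buffer
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        // exercise the chunk path with a buffer larger than one chunk
        byte[] data = new byte[3 * 1024 * 1024 + 7];
        ReadableByteChannel ch = Channels.newChannel(new ByteArrayInputStream(data));
        ByteBuffer buf = ByteBuffer.allocate(data.length);
        System.out.println(readFully(ch, buf) == data.length);
    }
}
```

The duplicate-and-limit trick means the caller's buffer position only advances by what was actually read, so short reads and EOF are handled the same way as before.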

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: 
http://jira.codehaus.org/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira