Simple disk IO/Hash consumes wild amounts of memory
---------------------------------------------------

                 Key: JRUBY-2701
                 URL: http://jira.codehaus.org/browse/JRUBY-2701
             Project: JRuby
          Issue Type: Bug
          Components: Core Classes/Modules, Performance
    Affects Versions: JRuby 1.1.2
         Environment: Ubuntu, java 1.5.0_13
            Reporter: Jørgen P. Tjernø


I used this command to generate the test data (change the *10 to experiment 
with different data sizes; as written it produces 10 MB):
/tmp$ JAVA_OPTS='-server' jruby -e 'open("testdata", "w") do |f| while f.pos < 1024*1024*10; f.puts "#{f.pos} 1"; end; end'

Then I ran this script to reproduce the memory usage:
/tmp$ JAVA_OPTS='-server -Dcom.sun.management.jmxremote' JAVA_MEM='-Xmx1024m' jruby -e 'occurrences = {}; File.open("testdata") { |f| f.each { |line| if line.strip =~ /^(.*) (\d+)$/; occurrences[$1] = $2.to_i; end } }'
Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
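
For readability, here is roughly the same logic as the one-liner above, 
written out as a small script. This is only a sketch: the method name 
count_occurrences and the StringIO sample are made up for illustration and 
are not part of the original report; the sample lines just mimic the 
"<position> 1" format that the generator command writes.

```ruby
require "stringio"

# Hypothetical helper (not from the original report): reads lines of the
# form "<key> <number>" from any IO-like object and stores them in a Hash,
# exactly as the one-liner does with occurrences[$1] = $2.to_i.
def count_occurrences(io)
  occurrences = {}
  io.each do |line|
    # Same regexp as in the one-liner: greedy key, a space, then digits.
    if line.strip =~ /^(.*) (\d+)$/
      occurrences[$1] = $2.to_i
    end
  end
  occurrences
end

# Tiny in-memory stand-in for the first three lines of "testdata".
sample = StringIO.new("0 1\n4 1\n8 1\n")
result = count_occurrences(sample)
```

To run it against the real file, something like 
File.open("testdata") { |f| count_occurrences(f) } should behave the same 
as the original command.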

/tmp$ wc -l testdata
1056867 testdata
/tmp$ wc -L testdata
10 testdata

So, yeah. It's using >1024MB of memory to run this code on ~1 million lines, 
where the longest line is 10 bytes. Quick calculations show that each line 
consumes about 1 KB of memory or more (isn't that a bit high for a hash entry 
that maps a string of at most 8 bytes to an int?). :o
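
The back-of-the-envelope figure can be checked with a few lines of Ruby. 
This is just the arithmetic, using the -Xmx1024m limit and the wc -l count 
from above; the variable names are mine. It assumes the whole heap was spent 
on the lines read, and since the run may have died before reaching the end 
of the file, the real per-line cost could be higher.

```ruby
# Heap limit from JAVA_MEM='-Xmx1024m', in bytes.
heap_bytes = 1024 * 1024 * 1024

# Number of lines in testdata, as reported by `wc -l`.
line_count = 1_056_867

# Average heap consumed per input line if the whole heap was exhausted
# after reading every line (a lower bound if it ran out earlier).
bytes_per_line = heap_bytes / line_count.to_f

puts format("%.0f bytes per line", bytes_per_line)
```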
