I ran today:

https://jena.svn.sourceforge.net/svnroot/jena/TDB/trunk/src-dev/reports/ReportOutOfMemoryManyGraphsTDB.java

in Eclipse in direct mode. It has some configuration choices you might like to try.

Max mem: 910M
DIRECT mode
> Starting test: Fri Mar 25 13:57:02 GMT 2011
> Initial number of indexed graphs: 0
100 at: Fri Mar 25 13:57:04 GMT 2011
200 at: Fri Mar 25 13:57:04 GMT 2011
300 at: Fri Mar 25 13:57:05 GMT 2011
400 at: Fri Mar 25 13:57:06 GMT 2011
500 at: Fri Mar 25 13:57:06 GMT 2011
600 at: Fri Mar 25 13:57:07 GMT 2011
700 at: Fri Mar 25 13:57:07 GMT 2011
800 at: Fri Mar 25 13:57:08 GMT 2011
900 at: Fri Mar 25 13:57:08 GMT 2011
1000 at: Fri Mar 25 13:57:09 GMT 2011
....
98000 at: Fri Mar 25 14:06:47 GMT 2011
98100 at: Fri Mar 25 14:06:47 GMT 2011
98200 at: Fri Mar 25 14:06:48 GMT 2011
98300 at: Fri Mar 25 14:06:48 GMT 2011
98400 at: Fri Mar 25 14:06:49 GMT 2011
98500 at: Fri Mar 25 14:06:50 GMT 2011
98600 at: Fri Mar 25 14:06:50 GMT 2011
98700 at: Fri Mar 25 14:06:52 GMT 2011
98800 at: Fri Mar 25 14:06:52 GMT 2011
98900 at: Fri Mar 25 14:06:53 GMT 2011
99000 at: Fri Mar 25 14:06:53 GMT 2011
99100 at: Fri Mar 25 14:06:54 GMT 2011
99200 at: Fri Mar 25 14:06:55 GMT 2011
99300 at: Fri Mar 25 14:06:55 GMT 2011
99400 at: Fri Mar 25 14:06:56 GMT 2011
99500 at: Fri Mar 25 14:06:56 GMT 2011
99600 at: Fri Mar 25 14:06:57 GMT 2011
99700 at: Fri Mar 25 14:06:58 GMT 2011
99800 at: Fri Mar 25 14:06:59 GMT 2011
99900 at: Fri Mar 25 14:07:00 GMT 2011
100000 at: Fri Mar 25 14:07:00 GMT 2011
> Done at: Fri Mar 25 14:07:04 GMT 2011
100,000 graphs in 601.98 sec

On 25/03/11 13:50, Frank Budinsky wrote:
Hi Andy and all, I finally managed to get a relatively powerful machine set up
Details?
and I reran the test program I sent you, but unfortunately, it still runs orders of magnitude slower than the numbers you got when you tried it. The hardware I used this time is similar to what you are using; the only significant difference is that it's running Windows 7 instead of Ubuntu. I know Linux is somewhat faster than Windows, but I don't expect we can blame Microsoft for a change from 573.87 seconds to about 4 hours :-) Are you sure that your numbers are correct? How big was the TDB database on disk at the end of the run?
3.9G DB1
Do you have any other ideas what may be wrong with my configuration?
Windows server or desktop? (server is better at I/O) 64 bit? Is there a virus checker?
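Some of these details can be read from inside the JVM that runs the test. The following is a minimal sketch, not part of the report program: the class name `EnvReport` and its helper methods are made up for illustration, and the "Max mem" line only imitates the format shown in the log above. The `sun.arch.data.model` property is specific to HotSpot-style JVMs and may be absent elsewhere, so it falls back to "unknown". It cannot tell you whether a virus checker is running.

```java
public class EnvReport {
    /** Converts a byte count to whole megabytes (1M = 1024 * 1024 bytes). */
    static long bytesToMegabytes(long bytes) {
        return bytes / (1024L * 1024L);
    }

    /** Builds a one-line summary of heap limit, OS and JVM bitness. */
    static String describe() {
        long maxMem = Runtime.getRuntime().maxMemory();
        return String.format(
            "Max mem: %dM, os.name=%s, os.arch=%s, java.version=%s, data model=%s",
            bytesToMegabytes(maxMem),
            System.getProperty("os.name"),
            System.getProperty("os.arch"),
            System.getProperty("java.version"),
            // Assumption: only some JVMs define this property.
            System.getProperty("sun.arch.data.model", "unknown"));
    }

    public static void main(String[] args) {
        System.out.println(describe());
    }
}
```

Running it with the same JVM and the same -Xmx setting as the test gives figures that can be compared directly between machines.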
I would very much appreciate it if others on this mailing list could also give it a quick try. I'd like to know if it's just me (and my colleagues), or whether there is some kind of pattern to explain this huge difference.

Here is the simple test program again (inlined this time, since Apache seems to throw away attachments). To run it, just change the TDB_DIR constant to some empty directory. The test program loads 100,000 datagraphs (about 100 triples each -> 10M triples total). It prints a message on the console at every 100, so if it's taking seconds for each println, you'll know very quickly that it will take hours to run, instead of a few minutes, like Andy has seen.

Thanks,
Frank.
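For anyone trying this, the time between two printlns is enough to estimate the whole run. At 601.98 sec for 100,000 graphs, each block of 100 takes about 0.6 sec; a 4 hour run (14,400 sec) corresponds to about 14.4 sec per block. The sketch below just does that arithmetic. The class and method names are invented for illustration and it does not touch TDB at all.

```java
import java.util.Locale;

public class RunTimeEstimate {
    /** Extrapolates the total run time in seconds from the time one batch took. */
    static double estimateTotalSeconds(double secondsPerBatch, int batchSize, int totalGraphs) {
        if (secondsPerBatch < 0 || batchSize <= 0 || totalGraphs < 0) {
            throw new IllegalArgumentException("arguments must be non-negative, batchSize > 0");
        }
        return secondsPerBatch * totalGraphs / batchSize;
    }

    /** Average load rate in graphs per second. */
    static double graphsPerSecond(int graphs, double seconds) {
        if (seconds <= 0) {
            throw new IllegalArgumentException("seconds must be positive");
        }
        return graphs / seconds;
    }

    public static void main(String[] args) {
        // Figures taken from the run reported above.
        System.out.printf(Locale.ROOT, "Reported rate: %.1f graphs/sec%n",
                graphsPerSecond(100_000, 601.98));

        // Pass the seconds you observe between two printlns as the first argument;
        // 14.4 is the per-batch time implied by a 4 hour run.
        double observed = args.length > 0 ? Double.parseDouble(args[0]) : 14.4;
        double total = estimateTotalSeconds(observed, 100, 100_000);
        System.out.printf(Locale.ROOT, "Estimated total: %.0f sec (%.2f hours)%n",
                total, total / 3600.0);
    }
}
```

So after the first few printlns you already know which of the two cases you are in, without waiting for the run to finish.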
