On 18/07/11 15:32, Al Baker wrote:
Hi Andy,

It's 32-bit -- no matter how much memory I feed the JVM, this always comes
up. At first, I thought I wasn't giving it enough RAM - but it was only 80k
triples and no amount of memory seemed enough.  That's when I realized this
particular program was "RDFizing" web content - in this case blog posts.  So
the literals could be quite large, e.g. several kilobytes.

I'll see if I can create a simple test case to reproduce the issue.

On a related note, I just tried TDB on a 32-bit and a 64-bit 512 Linode -
perhaps because of this issue, perhaps because the app uses too much memory
as it is.  Either way, in both instances, as the VM approaches complete memory
usage on the server and starts to swap, everything slows down to the point of
being completely unusable.  At first I thought 64-bit TDB would help, as the
memory-mapped files surely wouldn't wind up grinding the swap, but it only
appeared to reach maximum memory usage faster than on a 32-bit system.
It would be highly desirable if there were a "disable caching" option for
low-memory environments.

I started poking around the TDB source tree last night, but nothing jumped
out as where the caching would be - can you point me in the right direction
here?

Al,

It's not as easy (indeed, as possible) as it should be to set the caching sizes.

Good news is that as part of TxTDB I've had to revisit how datasets get built and making parameter setting sensibly possible is one thing I want to do.

Bad news is it isn't done.

Currently, pre-Tx, it's supposed to work but it's not tested. Dataset creation is done in SetupTDB, taking constants from SystemTDB as below. There is an undocumented properties file that should set the values.

In TxTDB, DatasetBuilderStd will do all building. Currently, it only builds for transactional usage via StoreConnection; the old way is still via SetupTDB, and any datasets built that way are ejected from the TDBFactory cache if you use them transactionally.

DatasetBuilderStd uses a Params object. I have just checked in a version that remembers to take the params for node id caching from SystemTDB.

You have to set

SystemTDB.Node2NodeIdCacheSize
SystemTDB.NodeId2NodeCacheSize

for the node caches and, for 32-bit machines,

SystemTDB.BlockWriteCacheSize
SystemTDB.BlockReadCacheSize

If you are on a small machine, then 32-bit mode is probably better anyway. Mapped files are very keen on taking up the whole machine, to the performance-exclusion of anything else.

As ever in Java, the heap size should be less than the real RAM size or swap death will occur, as you have found out.
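
A rough way to sanity-check that last point from inside the JVM is sketched below. It is only an illustration, not part of TDB: the class name HeapCheck, the method heapFitsInRam, the 25% headroom figure and the 512 MB default are all assumptions made for the example. The only real API used is Runtime.getRuntime().maxMemory(), which reports the maximum heap the JVM will try to use.

```java
public class HeapCheck {

    // Hypothetical helper: true if the heap limit leaves at least
    // headroomFraction of physical RAM free for the OS and other processes.
    static boolean heapFitsInRam(long maxHeapBytes, long physicalRamBytes,
                                 double headroomFraction) {
        long budget = (long) (physicalRamBytes * (1.0 - headroomFraction));
        return maxHeapBytes <= budget;
    }

    public static void main(String[] args) {
        // Physical RAM in MB is passed in by hand; 512 is an assumed default
        // matching the small VM discussed in this thread.
        long ramMb = args.length > 0 ? Long.parseLong(args[0]) : 512L;
        long ramBytes = ramMb * 1024L * 1024L;

        // Upper bound on the heap, as set by -Xmx or the JVM default.
        long maxHeap = Runtime.getRuntime().maxMemory();

        System.out.println("Max heap (MB): " + (maxHeap / (1024L * 1024L)));
        System.out.println("Assumed RAM (MB): " + ramMb);

        // 0.25 is an arbitrary headroom choice for this sketch.
        if (!heapFitsInRam(maxHeap, ramBytes, 0.25)) {
            System.out.println("Warning: heap limit is close to or above RAM; expect swapping.");
        }
    }
}
```

Run it with the same -Xmx setting as the application, e.g. "java -Xmx256m HeapCheck 512". In 64-bit mode the memory-mapped files live outside the heap, so this check says nothing about them.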

        Andy
