On 18/07/11 15:32, Al Baker wrote:
Hi Andy,
It's 32-bit -- no matter how much memory I feed the JVM, this always comes
up. At first, I thought I wasn't giving it enough RAM - but it was only 80k
triples and no amount of memory seemed enough. That's when I realized this
particular program was "RDFizing" web content - in this case blog posts. So
the literals could be quite large, e.g. several kilobytes.
I'll see if I can create a simple test case to reproduce the issue.
On a related note, I just tried TDB on a 32-bit and a 64-bit 512 Linode -
perhaps because of this issue, perhaps the app uses too much memory as it
is. Either way, in both instances as that VM approaches complete memory
usage on the server and it starts to swap - everything slows down to be
completely unusable. At first I thought TDB 64-bit would help, as the
memory-mapped files surely wouldn't wind up grinding the swap, but it only
appeared to get to maximum memory usage faster than when on a 32-bit system.
It would be highly desirable if there were a "disable caching" option for
low-memory environments.
I started poking around the TDB source tree last night, but nothing jumped
out as where the caching would be - can you point me in the right direction
here?
Al,
It's not as easy (or indeed as possible) as it should be to set the cache
sizes.
Good news is that as part of TxTDB I've had to revisit how datasets get
built and making parameter setting sensibly possible is one thing I want
to do.
Bad news is it isn't done.
Currently, pre-Tx it's supposed to work but it's not tested. Dataset
creation is done in SetupTDB taking constants from SystemTDB as below.
There is an undocumented properties file that should set the values.
In TxTDB, DatasetBuilderStd will do all building. Currently, it only
builds for transactional usage via StoreConnection - the old way is still
via SetupTDB, and any datasets built that way are ejected from the
TDBFactory cache if you use them transactionally.
DatasetBuilderStd uses a Params object. I have just checked in a
version that remembers to take the node id caching params from SystemTDB.
You have to set the
SystemTDB.Node2NodeIdCacheSize
SystemTDB.NodeId2NodeCacheSize
for the node caches and, for 32-bit machines:
SystemTDB.BlockWriteCacheSize
SystemTDB.BlockReadCacheSize
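To get a feel for why these sizes matter when the literals are large, here
is a back-of-envelope sketch. It is not TDB code: the class
NodeCacheEstimate, the 50,000-entry cache size and the 4 KB average node
size are all assumptions chosen for illustration, and real per-entry
overhead in the JVM will be higher than the raw payload counted here.

```java
// Rough estimate only: the entry count and average node size are assumed
// inputs, not values read from TDB.
final class NodeCacheEstimate {
    /** Approximate payload bytes held by a cache of `entries` nodes. */
    static long estimateBytes(long entries, long avgNodeBytes) {
        if (entries < 0 || avgNodeBytes < 0) {
            throw new IllegalArgumentException("sizes must be non-negative");
        }
        // multiplyExact throws ArithmeticException instead of overflowing silently.
        return Math.multiplyExact(entries, avgNodeBytes);
    }

    /** Converts a byte count to mebibytes. */
    static double toMiB(long bytes) {
        return bytes / (1024.0 * 1024.0);
    }

    public static void main(String[] args) {
        long entries = 50_000;      // hypothetical cache size
        long avgNodeBytes = 4_096;  // "several kilobytes" per literal
        long bytes = estimateBytes(entries, avgNodeBytes);
        System.out.printf("~%.0f MiB for one node cache%n", toMiB(bytes));
    }
}
```

With those assumed numbers a single node cache already holds about 195 MiB
of literal data, which would go some way to explaining why a modest heap
fills up long before the triple count looks large.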
If you are on a small machine, then 32-bit mode is probably better
anyway. Memory-mapped files are very keen on taking over the whole
machine, to the performance-exclusion of everything else.
As ever in Java, the heap size should be less than the real RAM size or
swap death will occur, as you have found out.
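
A minimal sketch of that rule as a startup sanity check, using only the
standard Runtime API. The class HeapBudgetCheck, the 512 MB RAM figure
(assuming "512 Linode" means 512 MB) and the 128 MB reserve for the OS and
non-heap JVM memory are assumptions, not measured values.

```java
// Startup sanity check: does the configured maximum heap leave room in RAM?
final class HeapBudgetCheck {
    /** True when the maximum heap leaves at least `reserveBytes` of `ramBytes` free. */
    static boolean heapFits(long maxHeapBytes, long ramBytes, long reserveBytes) {
        return maxHeapBytes <= ramBytes - reserveBytes;
    }

    public static void main(String[] args) {
        long ram = 512L * 1024 * 1024;      // assumed physical RAM of the VM
        long reserve = 128L * 1024 * 1024;  // assumed allowance for OS and non-heap memory
        long maxHeap = Runtime.getRuntime().maxMemory(); // upper bound on the heap, e.g. from -Xmx
        System.out.println(heapFits(maxHeap, ram, reserve)
                ? "heap fits in RAM"
                : "heap too large for this machine: expect swapping");
    }
}
```

Note this only covers the Java heap. Memory-mapped files live outside the
heap, so in 64-bit mode they compete for whatever RAM is left over.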
Andy