On 25/01/11 09:33, Hasan Hasan wrote:
Hi Andy,
thanks for taking a look at the code.
This means that there is a limit to the number of triples with large
literals that can be returned by jenaGraph.find(). Right?
Not by design - Graph.find() returns a streaming iterator from TDB.
If the application keeps the triples it gets back, they take up space:
RDF terms are materialized in order to return them, so there is no
delayed evaluation there.
But once the iterator from Graph.find() has returned a triple, it's not
in TDB at all. There is an issue with how the node table cache might
grow because of large literals in it, but it is limited to a maximum
number of entries. Turn the cache size down.
If this limit is
exceeded, then it can lead to an OutOfMemoryError. And this limit
depends on the max memory allocated for the heap and the size of the literals?
And the size of the cache.
So to see whether there is a memory leak, I could try looping over
jenaGraph.find(), where no iteration should run out of heap
memory.
If the heap is big enough for the cache. The worst case is pretty big.
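
To illustrate why the worst case is big, here is a minimal sketch of a cache that is bounded by number of entries rather than by bytes. This is not TDB's node table cache: the class name CountBoundedCache, the method retainedChars() and the sizes used in main are all made up for the example. It only shows the general effect: the entry count stays capped, but the memory retained still scales with the size of the literals held in those entries.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustration only: a hypothetical LRU cache capped by entry count.
// It is NOT the TDB node table cache, just the same kind of bound.
class CountBoundedCache extends LinkedHashMap<Long, String> {
    private final int maxEntries;

    CountBoundedCache(int maxEntries) {
        super(16, 0.75f, true); // accessOrder=true gives least-recently-used ordering
        this.maxEntries = maxEntries;
    }

    // Called by put(): evict the eldest entry once the cap is exceeded.
    @Override
    protected boolean removeEldestEntry(Map.Entry<Long, String> eldest) {
        return size() > maxEntries;
    }

    // Total number of characters currently held by the cached values.
    long retainedChars() {
        long total = 0;
        for (String value : values()) {
            total += value.length();
        }
        return total;
    }

    public static void main(String[] args) {
        int maxEntries = 100;       // assumed cache size, for illustration
        int literalLength = 10_000; // assumed size of one "large literal"

        CountBoundedCache cache = new CountBoundedCache(maxEntries);
        for (long id = 0; id < 1_000; id++) {
            // Each value stands in for a distinct large literal.
            cache.put(id, new String(new char[literalLength]));
        }

        // The entry count is capped, but the retained data is
        // maxEntries * literalLength characters.
        System.out.println("entries: " + cache.size());
        System.out.println("retained chars: " + cache.retainedChars());
    }
}
```

With a cap like this, reducing the number of entries or the size of each literal are the only two ways to shrink the worst case, which is why both turning the cache size down and storing references instead of large literals help.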
I'll test it now and let you know.
But we'll consider your suggestion to not have large literals in the
triples, but their references.
Cheers
Hasan
Let me know how it goes,
Andy