Re: leak but where after parsing rdf files?

Andy Seaborne Tue, 25 Jan 2011 02:15:01 -0800


On 25/01/11 09:33, Hasan Hasan wrote:

Hi Andy,

thanks for taking a look at the code.
This means that there is a limit to the number of triples with large
literals that can be returned by jenaGraph.find(). Right?

Not in the design - Graph.find() returns a streaming iterator from TDB. If the application keeps the triples returned, then they take space. RDF terms are materialized to return them - there is no delayed evaluation there.

But once the iterator from Graph.find() has returned a triple, it's not in TDB at all. There is an issue with how the node table cache might grow because of large literals in it, but it is limited to a maximum number of entries. Turn the cache size down.
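To illustrate the point about a cache that is limited by entry count rather than by bytes, here is a minimal sketch in plain Java. It is not TDB's node table cache: `BoundedNodeCache`, `CacheDemo` and the sizes used are hypothetical stand-ins, built on the standard `LinkedHashMap` LRU idiom. The entry count stays capped, but the retained memory still scales with the size of each literal.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for a node cache; NOT TDB's actual implementation.
// It evicts the least recently used entry once maxEntries is exceeded.
class BoundedNodeCache<K, V> extends LinkedHashMap<K, V> {
    private final int maxEntries;

    BoundedNodeCache(int maxEntries) {
        super(16, 0.75f, true); // true = access order, which gives LRU eviction
        this.maxEntries = maxEntries;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > maxEntries;
    }
}

class CacheDemo {
    // Total number of characters held by the cached literals.
    static long retainedChars(Map<Long, String> cache) {
        long total = 0;
        for (String s : cache.values()) {
            total += s.length();
        }
        return total;
    }

    // Inserts `inserts` distinct literals of `literalLength` characters each.
    static BoundedNodeCache<Long, String> fill(int maxEntries, int inserts, int literalLength) {
        BoundedNodeCache<Long, String> cache = new BoundedNodeCache<>(maxEntries);
        for (long id = 0; id < inserts; id++) {
            char[] buf = new char[literalLength];
            Arrays.fill(buf, 'x');
            cache.put(id, new String(buf)); // a fresh String per entry
        }
        return cache;
    }

    public static void main(String[] args) {
        // Assumed sizes, for illustration only.
        BoundedNodeCache<Long, String> cache = fill(100, 1000, 1000);
        System.out.println("entries:  " + cache.size());
        System.out.println("retained: " + retainedChars(cache) + " chars");
    }
}
```

With the entry limit fixed, multiplying the literal length by ten multiplies the retained characters by ten, which is why turning the entry limit down is the available lever.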

If this limit is
exceeded, then it can lead to an OutOfMemoryError. And this limit
depends on the max memory allocated for the heap and the size of the literals?


And the size of the cache.

So to see whether there is a memory leak, I could try to loop over
jenaGraph.find() where in each iteration there shouldn't be a heap memory
exception.


If the heap is big enough for the cache.  The worst case is pretty big.
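A rough sketch of such a loop test, in plain Java. The `Supplier<Iterator<T>>` is a hypothetical stand-in for whatever produces the iterator (for example a call to jenaGraph.find()), and `LeakCheck` is an illustrative name. Each pass drains the iterator without keeping any element and records the used heap. `System.gc()` is only a hint to the JVM, so the figures are approximate: a plateau once any cache has filled suggests no leak, while steady growth across passes would be worth investigating.

```java
import java.util.Iterator;
import java.util.function.Supplier;

class LeakCheck {
    // Approximate used heap in bytes, after suggesting a GC (only a hint).
    static long usedHeap() {
        Runtime rt = Runtime.getRuntime();
        System.gc();
        return rt.totalMemory() - rt.freeMemory();
    }

    // Drains the iterator `passes` times without retaining any element
    // and returns the used heap measured after each pass.
    static <T> long[] run(Supplier<Iterator<T>> source, int passes) {
        long[] used = new long[passes];
        for (int i = 0; i < passes; i++) {
            Iterator<T> it = source.get();
            while (it.hasNext()) {
                it.next(); // deliberately discarded
            }
            used[i] = usedHeap();
        }
        return used;
    }

    public static void main(String[] args) {
        // Stand-in source: 1000 large strings per pass (assumed sizes).
        Supplier<Iterator<String>> source = () -> new Iterator<String>() {
            private int n = 0;
            @Override public boolean hasNext() { return n < 1000; }
            @Override public String next() { n++; return new String(new char[10_000]); }
        };
        long[] used = run(source, 5);
        for (int i = 0; i < used.length; i++) {
            System.out.println("pass " + i + ": " + used[i] + " bytes used");
        }
    }
}
```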

I'll test it now and let you know.
But we'll consider your suggestion to not have large literals in the
triples, but their references.

Cheers
Hasan


let me know how it goes,

        Andy
