Hi Andy,

I have been doing some work on this, trying to understand what is happening. My 
findings can be summarised as follows:

-- The readings of the (possibly leaked) memory are between 1% and 5% of the 
total heap in use. So I agree with you that there is no memory leak, just a 
cache filling up. Could I conclude it was a false alarm?! 
--- Although it is not a memory leak, the memory keeps growing (slowly). I 
haven't run it long enough to reach a stable state.
-- Node.cache(false) didn't make any difference.
-- I found a trivial memory leak in my query execution method: I wasn't closing 
the ByteArrayOutputStream that collects the solution set.
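
The fix for that last item is just the usual close-in-finally pattern. A minimal 
stand-alone sketch (plain JDK, with a hypothetical writeResults() standing in 
for the Jena serialisation step):

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;

public class CloseStreamSketch {

    // Hypothetical stand-in for ResultSetFormatter.outputAsJSON(out, results).
    static void writeResults(OutputStream out) throws IOException {
        out.write("{\"results\":[]}".getBytes("UTF-8"));
    }

    public static String runQuery() throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try {
            writeResults(out);
            return out.toString("UTF-8");
        } finally {
            // Close in finally so cleanup happens even if serialisation
            // throws. For ByteArrayOutputStream itself close() is a no-op,
            // but the habit matters for any stream type that holds resources.
            out.close();
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(runQuery());
    }
}
```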

I will keep doing some more testing in this area.
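
For the longer runs, something like the following minimal heap sampler (plain 
JDK, no Jena dependency; HeapSampler is a made-up name) can be dropped between 
query batches to watch whether usage plateaus:

```java
public class HeapSampler {

    // Sample used heap after a GC hint. System.gc() is only advisory, so
    // treat the numbers as a trend over many samples, not an exact figure.
    public static long usedHeapBytes() {
        Runtime rt = Runtime.getRuntime();
        rt.gc();
        return rt.totalMemory() - rt.freeMemory();
    }

    public static void main(String[] args) {
        System.out.println("Used heap: "
                + (usedHeapBytes() / (1024 * 1024)) + " MB");
    }
}
```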

Please see the test source code below.

Regards,
Emilio

public class TDBModel {

    private static Model model = null;
    public static Logger logger = Logger.getLogger("TDBModel");
    
    public static String repoDir;

    /**
     * Constructor for a Test TDB-based Model
     */
    public TDBModel(String mainOWLFile, String TDBdir)
            throws QueryExecException, QueryParseException, IOException {

        logger.info("Entering TDB-based MODEL application.");

        /**
         * ***********************************************
         * Overall Configuration details
         *************************************************
         */
        // Model Name
        // Reading Server Configuration
        logger.info("===== TDB Model Configuration =====");
        String modelName = "mainModel";
        // Locations of ontology to load
        String ontFile = mainOWLFile;

        //Repository Directory
        repoDir = TDBdir;
        logger.info("OWL Model File = " + ontFile);
        logger.info("TDB Location = " + repoDir);

        /**
         * ***********************************************
         * Back-End Mode: DB, MEM, or TDB
         *************************************************
         */
        // Ontology Model Spec
        OntModelSpec spec = new OntModelSpec(OntModelSpec.OWL_MEM);

        // Back-end shape and format
        logger.info("Creating TDB-based model");
        model = createTDBModel(modelName, spec);
        TDB.sync(model);

        logger.info("TDB Model READY!!\n");
    }

    /**
     * *************************************************
     *
     * MODEL Methods: create, update
     *
     *************************************************
     */
    private static Model createTDBModel(String modelName, OntModelSpec spec) {

        logger.info("The model name is " + modelName);

        //Check if the location for TDB exists
        File tdbDir = new File(repoDir);
        if (!tdbDir.exists() || !tdbDir.canRead()) {
            logger.error("Unable to open " + repoDir);
            System.exit(1);
        }

        TDB.getContext().set(ARQ.symLogExec, true);

        Model bModel = TDBFactory.createModel(repoDir);

        // now create a reasoning model using this base, and return it
        OntModel aModel = ModelFactory.createOntologyModel(spec, bModel);

        return aModel;
    }

    public String updateModel(String inputQuery)
            throws QueryParseException, QueryExecException {

        logger.info("MODEL UPDATE");

        String queryResults = null;

        model.enterCriticalSection(Lock.WRITE);

        try {
            //Node.cache(false);
            UpdateRequest updateRequest = UpdateFactory.create(inputQuery);
            UpdateAction.execute(updateRequest, model);

            queryResults = "Model Updated";

        } finally {
            //model.rebind();
            model.commit();
            TDB.sync(model);
            model.leaveCriticalSection();
        }

        logger.info("END MODEL UPDATE");

        return queryResults;
    }

    public String querySelect(String queryInput)
            throws QueryParseException, QueryExecException, IOException,
            ConcurrentModificationException {

        String queryResults = null;
        model.enterCriticalSection(Lock.READ);
        try {
            queryResults = printSelectQueryAsJSON(queryInput);
        } finally {
            model.leaveCriticalSection();
        }

        return queryResults;
    }

    public static String printSelectQueryAsJSON(String queryString)
            throws QueryParseException, QueryExecException, IOException,
            ConcurrentModificationException {

        String queryResults = null;

        Query query = QueryFactory.create(queryString, Syntax.syntaxARQ);

        ByteArrayOutputStream out = new ByteArrayOutputStream();
        QueryExecution qexec = QueryExecutionFactory.create(query, model);

        qexec.getContext().set(ARQ.symLogExec, true);
        try {
            //Node.cache(false);
            ResultSet results;

            results = qexec.execSelect();

            ResultSetFormatter.outputAsJSON(out, results);
            queryResults = out.toString();

        } finally {
            qexec.close();
            out.flush();
            out.close();
        }
        return queryResults;
    }
}



On 4 Feb 2012, at 11:49, Andy Seaborne wrote:

> Hi Emilio,
> 
> Have you had a chance to try turning the node cache off and did it make a 
> difference?
> 
>       Andy
> 
> On 31/01/12 11:23, Andy Seaborne wrote:
>> Hi Emilio,
>> 
>> It may not be a leak so much as a cache filling up. One such cache
>> (there are others) is a NodeCache in Node. TDB also has caches. There
>> may be an interaction.
>> 
>> Try turning it off before you start the run.
>> 
>> Node.cache( false ) ;
>> 
>> I'd be interested in hearing how much difference this makes.
>> 
>> Once-upon-a-time (i.e. Java 1.4 days) it was appreciable but the last
>> time I tested it, then the effect was much less. I suspect that the
>> sophistication of memory management and GC has got better and the cost
>> of object churn is now small. Some real facts would be good though!
>> 
>> If it's not the cache directly, can you produce a complete, minimal
>> example? All the Jena committers have access to YourKit [*] so we can
>> profile code ... if we have code to profile.
>> 
>> Andy
>> 
>> [*] Thanks to YourKit - we have free use as an open source project.
>> 
>> On 30/01/12 17:50, Emilio Miguelanez wrote:
>>> Hi list,
>>> 
>>> I have developed a Jena-based application, which has a TDB back-end
>>> that loads an OWL file with a good number of triples. All using the
>>> Jena API.
>>> 
>>> I have done some testing (1000 queries every second), and I get a
>>> memory leak on com.hp.hpl.jena.graph.Node_Literal.
>>> 
>>> The stack is as follows
>>> 
>>> 
>>> Currently 470 leaked objects exist of class
>>> com.hp.hpl.jena.graph.Node_Literal
>>> 
>>> The objects are created at
>>> com.hp.hpl.jena.graph.Node$2.construct(java.lang.Object):238
>>> 
>>> and are being held
>>> in nodes of com.hp.hpl.jena.graph.NodeCache
>>> in present of com.hp.hpl.jena.graph.Node
>>> in elementData of java.util.Vector
>>> in classes of sun.misc.Launcher$AppClassLoader
>>> 
>>> 
>>> The leak is not that huge (101K of the total available 77M heap), but
>>> it is a leak. It starts to show up half-way through the testing
>>> session. So I am assuming that a leaked object is created every time a
>>> query is executed. The query is being done in the following fn:
>>> 
>>> public static String printSelectQueryAsJSON(Model model, String queryString)
>>>         throws QueryExecException, QueryParseException, IOException,
>>>         ConcurrentModificationException {
>>> 
>>>     String queryResults = null;
>>> 
>>>     Query query = QueryFactory.create(queryString, Syntax.syntaxARQ);
>>> 
>>>     if (!query.isSelectType()) {
>>>         throw new QueryExecException("Attempt to execute a SELECT query from a "
>>>                 + QueryEngineUtils.labelForQuery(query) + " query");
>>>     }
>>> 
>>>     QueryExecution qexec = QueryExecutionFactory.create(query, model);
>>> 
>>>     //qexec.getContext().set(TDB.symLogExec, Explain.InfoLevel.FINE);
>>> 
>>>     ByteArrayOutputStream out = new ByteArrayOutputStream();
>>> 
>>>     try {
>>>         ResultSet results;
>>>         results = qexec.execSelect();
>>> 
>>>         ResultSetFormatter.outputAsJSON(out, results);
>>>         queryResults = out.toString();
>>> 
>>>     } finally {
>>>         qexec.close();
>>>         out.flush();
>>>     }
>>> 
>>>     long epoch_ends = System.currentTimeMillis();
>>>     logger.debug("QET (" + lQueryId + "): " + (epoch_ends - epoch_starts));
>>> 
>>>     return queryResults;
>>> }
>>> 
>>> 
>>> For extra info, the set of libraries I am using is:
>>> 
>>> Jena 2.6.4
>>> ARQ-2.8.8
>>> TDB-0.8.10
>>> 
>>> 
>>> Any idea where I should start looking to solve this memory leak will
>>> be useful.
>>> 
>>> Thanks in advance,
>>> Emilio
>>> 
>>> 
>>> 
>> 
> 

---
Emilio Migueláñez Martín
[email protected]

