Håvard Wanvik Stenersen created JENA-1140:
---------------------------------------------

             Summary: Jena 3.0.1 model halts reading large rdf file partway through
                 Key: JENA-1140
                 URL: https://issues.apache.org/jira/browse/JENA-1140
             Project: Apache Jena
          Issue Type: Bug
          Components: Jena, RDF API
    Affects Versions: Jena 3.0.1
         Environment: Eclipse on Windows 7, 8, Mac
            Reporter: Håvard Wanvik Stenersen
             Fix For: Jena 2.7.4


Progress halts, or becomes so slow that it is unnoticeable, without execution 
stopping or crashing, when attempting to read a large (~250MB) Turtle RDF file 
into a Jena model created with 
org.apache.jena.rdf.model.ModelFactory.createDefaultModel(), using 
org.apache.jena.rdf.model.Model's read() method (tested with both the 
String url and the InputStream in overloads).

Reading proceeds until the process uses 1-1.5GB of RAM, at which point progress 
halts, but execution neither stops nor crashes. The code at the bottom 
demonstrates the behaviour with a progress bar for the file being read.
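
To tell whether the stall coincides with the heap filling up, heap usage could be logged while the file is being read. The following is a minimal sketch using only the standard library; the class name HeapSampler and its methods are hypothetical and not part of Jena.

```java
public class HeapSampler {

    /** Heap currently in use by this JVM, in MiB. */
    static long usedHeapMiB() {
        Runtime rt = Runtime.getRuntime();
        return (rt.totalMemory() - rt.freeMemory()) / (1024L * 1024L);
    }

    /** Starts a daemon thread that prints heap usage every intervalMillis. */
    static Thread start(long intervalMillis) {
        Thread t = new Thread(() -> {
            try {
                while (true) {
                    System.err.println("used heap: " + usedHeapMiB() + " MiB");
                    Thread.sleep(intervalMillis);
                }
            } catch (InterruptedException e) {
                // interrupted: stop sampling
            }
        }, "heap-sampler");
        t.setDaemon(true); // does not keep the JVM alive on its own
        t.start();
        return t;
    }

    public static void main(String[] args) throws InterruptedException {
        Thread sampler = start(1000);
        Thread.sleep(3000); // stand-in for the long-running read
        sampler.interrupt();
    }
}
```

Calling HeapSampler.start(1000) before the read would print one line per second to stderr. If the reported value levels off near the maximum heap when the progress bar stops, that would suggest memory pressure rather than a stalled input stream.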

This has been the case on my laptop running Windows 10, using
Eclipse
Version: Mars.1 Release (4.5.1)
Build id: 20150924-1200

on my desktop running Windows 7, using 
Eclipse
Version: Kepler Service Release 2
Build id: 20140224-627

and on my professor's Mac using Eclipse, although I don't know which versions.

All three systems were employing Apache Jena 3.0.1, and all of them experienced 
the same issue.

I have attempted to manually set the maximum heap size of the JVM using the 
-Xmx3G option, but the result did not change.
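
One way to confirm that the -Xmx3G setting actually reached the JVM running the test is to print the maximum heap from inside that JVM. In Eclipse, the option needs to be in the VM arguments of the run configuration; a value in eclipse.ini only affects the IDE itself. The following is a minimal sketch; the class name HeapCheck is hypothetical.

```java
public class HeapCheck {

    /** Maximum heap this JVM will attempt to use, in MiB. */
    static long maxHeapMiB() {
        return Runtime.getRuntime().maxMemory() / (1024L * 1024L);
    }

    public static void main(String[] args) {
        // Run with the same launch configuration as the test program.
        System.out.println("Max heap: " + maxHeapMiB() + " MiB");
    }
}
```

With -Xmx3G in effect, the printed value should be close to 3072 MiB (the exact figure can vary slightly by JVM and garbage collector).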

Switching to Apache Jena 2.7.4, and using the same classes from the 
com.hp.hpl package instead of org.apache, fixed the problem on all three systems.

Here is the Java test code:

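{code:title=ReadLotsOfRDF.java|borderStyle=solid}
import java.io.BufferedInputStream;
import java.io.FileInputStream;
import java.io.InputStream;

// Jena 2.7.4 packages (this variant works). For Jena 3.0.1, where the
// problem occurs, the same classes are in org.apache.jena.rdf.model.
import com.hp.hpl.jena.rdf.model.Model;
import com.hp.hpl.jena.rdf.model.ModelFactory;

import javax.swing.JFrame;
import javax.swing.ProgressMonitorInputStream;

public class ReadLotsOfRDF {

        public static void main(String[] args) throws java.io.IOException {
                // parent frame for the progress monitor dialog (never shown itself)
                final JFrame f = new JFrame("Sample");

                Model m = ModelFactory.createDefaultModel();
                // wrap the file stream so a progress dialog tracks bytes read;
                // try-with-resources closes the stream once reading finishes
                try (InputStream in = new BufferedInputStream(
                                new ProgressMonitorInputStream(f, "Progress",
                                                new FileInputStream("LSQ-BM.ttl")))) {
                        // no base URI; parse the input as Turtle
                        m.read(in, null, "TTL");
                }
                // number of statements read into the model
                System.out.println(m.size());

        }

}
{code}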

The "LSQ-BM.ttl" file can be (and was) retrieved from 
[here|https://drive.google.com/file/d/0B1tUDhWNTjO-UGhDTWx5U1EyWTg/view].



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
