[ 
https://issues.apache.org/jira/browse/JENA-1138?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15141276#comment-15141276
 ] 

Andy Seaborne commented on JENA-1138:
-------------------------------------

I added a transaction wrapper around the read in addGraphs, so the load now 
runs inside a single transaction.  I still need 2G of heap; 1.5G does not work.

Are we sure the cause is the lack of a transaction?  My experiment indicates 
that the memory dataset has a large overhead even inside a single 
transaction.  Or maybe the data really does take that much space.
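
One way to tell "large overhead" from "it really takes that much space" would be to measure heap growth around the load instead of bisecting on -Xmx. The following is only a rough sketch using plain JDK calls: {{HeapProbe}} and {{measure}} are hypothetical names, not part of Jena, and the loader in {{main}} is a stand-in for the real {{RDFDataMgr.read}} call. {{System.gc()}} is only a hint to the JVM, so the numbers are approximate.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

/** Hypothetical helper (not part of Jena): approximate heap growth caused by a load. */
final class HeapProbe {

    /** The loaded object plus the approximate change in used heap, in bytes. */
    static final class Measured<T> {
        final T value;
        final long deltaBytes;

        Measured(T value, long deltaBytes) {
            this.value = value;
            this.deltaBytes = deltaBytes;
        }
    }

    /** Used heap after requesting a GC; System.gc() is only a hint, so this is approximate. */
    static long usedHeapBytes() {
        Runtime rt = Runtime.getRuntime();
        System.gc();
        return rt.totalMemory() - rt.freeMemory();
    }

    /** Runs the loader and reports used heap after minus used heap before. */
    static <T> Measured<T> measure(Supplier<T> loader) {
        long before = usedHeapBytes();
        T value = loader.get(); // the result is returned, so it stays reachable while measured
        long after = usedHeapBytes();
        return new Measured<>(value, after - before);
    }

    public static void main(String[] args) {
        // Stand-in workload; in the real experiment the loader would read the data file.
        Measured<List<String>> m = measure(() -> {
            List<String> data = new ArrayList<>();
            for (int i = 0; i < 1_000_000; i++) {
                data.add("triple-" + i);
            }
            return data;
        });
        System.err.println("items=" + m.value.size()
                + " approx heap delta (MB)=" + m.deltaBytes / (1024 * 1024));
    }
}
```

Running the same measurement against each Jena version, with and without the transaction, would show whether the growth comes from the transaction handling or from the storage itself.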


Experiment:
ModDatasetGeneral.addGraphs:
{noformat}
            if ( dataURLs != null ) 
            {
                // Check once; wrap all the reads in a single write transaction.
                boolean txn = ds.supportsTransactions() ;
                if ( txn ) {
                    System.err.println("TRANSACTION") ;
                    ds.begin(ReadWrite.WRITE) ;
                }
                
                for ( String url : dataURLs )
                    RDFDataMgr.read(ds, url) ;
                
                if ( txn ) {
                    ds.commit() ;
                    ds.end() ;
                }
            }
{noformat}
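
The experiment code does not call end() if a read throws. A possible exception-safe shape for the same single-transaction load is sketched below. To keep it self-contained it does not use the Jena classes: {{TxnDataset}}, {{RecordingDataset}} and {{SingleTxnLoad}} are hypothetical stand-ins, and the sketch assumes that calling end() on an uncommitted write transaction discards the changes.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Consumer;

/** Hypothetical stand-in for the transactional part of a dataset; not a Jena type. */
interface TxnDataset {
    boolean supportsTransactions();
    void beginWrite();
    void commit();
    void end(); // assumed to discard changes if commit() was not called
}

/** Test double that records the calls made to it. */
final class RecordingDataset implements TxnDataset {
    final List<String> calls = new ArrayList<>();
    private final boolean transactional;

    RecordingDataset(boolean transactional) { this.transactional = transactional; }

    @Override public boolean supportsTransactions() { return transactional; }
    @Override public void beginWrite() { calls.add("begin"); }
    @Override public void commit() { calls.add("commit"); }
    @Override public void end() { calls.add("end"); }
}

final class SingleTxnLoad {

    /** Reads every URL inside one write transaction, if the dataset supports them. */
    static void loadAll(TxnDataset ds, List<String> urls, Consumer<String> reader) {
        boolean txn = ds.supportsTransactions();
        if (txn) ds.beginWrite();
        try {
            for (String url : urls) {
                reader.accept(url);
            }
            if (txn) ds.commit(); // reached only if every read succeeded
        } finally {
            if (txn) ds.end();    // always runs, so the transaction is never left open
        }
    }

    public static void main(String[] args) {
        RecordingDataset ds = new RecordingDataset(true);
        loadAll(ds, Arrays.asList("a.ttl", "b.ttl"), url -> ds.calls.add("read " + url));
        System.err.println(ds.calls); // prints [begin, read a.ttl, read b.ttl, commit, end]
    }
}
```

This does not change the memory behaviour being investigated; it only keeps a failed read from leaving the write transaction open.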

> java.lang.OutOfMemoryError: GC overhead limit exceeded
> ------------------------------------------------------
>
>                 Key: JENA-1138
>                 URL: https://issues.apache.org/jira/browse/JENA-1138
>             Project: Apache Jena
>          Issue Type: Bug
>          Components: Cmd line tools
>    Affects Versions: Jena 3.0.1
>         Environment: Oracle JDK 1.8.0, Windows 7 64bit
>            Reporter: Giovanni Mels
>              Labels: performance
>         Attachments: sample-data.zip
>
>
> Since 3.0.1 we get {{java.lang.OutOfMemoryError: GC overhead limit exceeded}} 
> exceptions when using the {{sparql}} command line tool, even on relatively 
> small datasets (~1.6 million triples).
> The issue occurs while the dataset is being loaded into memory, before the 
> actual query execution. 
> {code}
> sparql --query empty.rq --data sample-data.ttl
> {code}
> Where {{empty.rq}} contains:
> {noformat}
> SELECT * WHERE {}
> {noformat}
> This query takes ~20 seconds using Jena 2.13.0 and Jena 3.0.0; with 3.0.1 it 
> fails after ~4 minutes with {{java.lang.OutOfMemoryError: GC overhead limit 
> exceeded}}.



