[
https://issues.apache.org/jira/browse/JENA-1138?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15141276#comment-15141276
]
Andy Seaborne commented on JENA-1138:
-------------------------------------
I added a transaction wrapper around the read in addGraphs. I still need 2G of
heap; 1.5G does not work.
The read now happens inside a single transaction.
Are we sure that the cause is the lack of a transaction? My experiment
indicates that the memory dataset creates a large overhead even inside one
transaction. Or maybe it really does take that much space.
Experiment:
ModDatasetGeneral.addGraphs:
{noformat}
if ( dataURLs != null )
{
    if ( ds.supportsTransactions() )
        System.err.println("TRANSACTION") ;
    if ( ds.supportsTransactions() )
        ds.begin(ReadWrite.WRITE) ;
    for ( String url : dataURLs )
        RDFDataMgr.read(ds, url) ;
    if ( ds.supportsTransactions() ) {
        ds.commit() ;
        ds.end() ;
    }
}
{noformat}
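One way to tell "large overhead" apart from "it really does take that much space" would be to measure the used heap before and after the load, rather than finding the limit by trial and error with different heap sizes. The following is only a minimal sketch using the JDK alone: HeapProbe, usedHeapBytes and measure are hypothetical names, not part of Jena, and the allocation in main is a stand-in for the RDFDataMgr.read loop above. The figures are approximate because Runtime.gc() is only a hint to the JVM.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper (not part of Jena): estimates heap retained by a load step. */
public class HeapProbe {

    /** Approximate heap in use, after asking the JVM to collect garbage. */
    public static long usedHeapBytes() {
        Runtime rt = Runtime.getRuntime();
        rt.gc(); // best effort only; the JVM may ignore the request
        return rt.totalMemory() - rt.freeMemory();
    }

    /** Runs the load step and returns the approximate growth in used heap, in bytes. */
    public static long measure(Runnable load) {
        long before = usedHeapBytes();
        load.run();
        return usedHeapBytes() - before;
    }

    public static void main(String[] args) {
        // Stand-in for the real load. In the experiment above this would be
        // the loop calling RDFDataMgr.read(ds, url) inside the transaction.
        List<long[]> retained = new ArrayList<>();
        long delta = measure(() -> {
            for (int i = 0; i < 64; i++)
                retained.add(new long[128 * 1024]); // 1 MiB per array
        });
        System.out.printf("retained ~%d MB (max heap %d MB)%n",
                delta >> 20, Runtime.getRuntime().maxMemory() >> 20);
        // Reference the list after measuring so it stays reachable until here.
        if (retained.isEmpty())
            throw new IllegalStateException("nothing was loaded");
    }
}
```

Comparing the retained figure for the same data across Jena versions would show whether the extra heap is held by the dataset itself or is transient garbage produced during the load.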
> java.lang.OutOfMemoryError: GC overhead limit exceeded
> ------------------------------------------------------
>
> Key: JENA-1138
> URL: https://issues.apache.org/jira/browse/JENA-1138
> Project: Apache Jena
> Issue Type: Bug
> Components: Cmd line tools
> Affects Versions: Jena 3.0.1
> Environment: Oracle JDK 1.8.0, Windows 7 64bit
> Reporter: Giovanni Mels
> Labels: performance
> Attachments: sample-data.zip
>
>
> Since 3.0.1 we get {{java.lang.OutOfMemoryError: GC overhead limit exceeded}}
> exceptions when using the {{sparql}} command line tool, even on relatively
> small datasets (~1.6 million triples).
> The issue occurs while the dataset is being loaded into memory, that is,
> before the actual query execution.
> {code}
> sparql --query empty.rq --data sample-data.ttl
> {code}
> Where {{empty.rq}} contains:
> {noformat}
> SELECT * WHERE {}
> {noformat}
> This query takes ~20 seconds using Jena 2.13.0 and Jena 3.0.0; with 3.0.1 it
> fails after ~4 minutes with {{java.lang.OutOfMemoryError: GC overhead limit
> exceeded}}.