[ https://issues.apache.org/jira/browse/JENA-1138?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15140962#comment-15140962 ]
A. Soroka commented on JENA-1138:
---------------------------------
I am curious to know exactly how the dataset in question was built, i.e. which
method from {{DatasetFactory}} was called. It's certainly possible to use either
the new (transactional) dataset or the older (non-transactional, but much
leaner) dataset. Currently, {{DatasetFactory::create}} and
{{DatasetFactory::createGeneral}} both return the _old_ dataset implementation,
not the new one; only {{DatasetFactory::createTxnMem}} gives you the new one.
What's more, the way you use the new dataset can make a difference: using
transactions properly can lower the running costs, although as [~rvesse]
rightly says, the new in-memory dataset is inherently much hungrier for memory.
Generally, with the new dataset implementation loading is slower (although
normally not by nearly the factor you describe) but querying is faster
(depending on your query, of course).
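To make the distinction concrete, here is a minimal sketch of the two construction paths. It assumes a Jena build that already has {{DatasetFactory::createGeneral}} and {{DatasetFactory::createTxnMem}}; the class name {{DatasetChoice}}, the helper method names, and the default file name are made up for illustration and are not what the {{sparql}} tool itself does.

```java
import org.apache.jena.query.Dataset;
import org.apache.jena.query.DatasetFactory;
import org.apache.jena.query.ReadWrite;
import org.apache.jena.riot.RDFDataMgr;

// Hypothetical helper class, for illustration only.
public class DatasetChoice {

    // Older, non-transactional (leaner) in-memory dataset.
    static Dataset loadGeneral(String path) {
        Dataset ds = DatasetFactory.createGeneral();
        RDFDataMgr.read(ds, path);
        return ds;
    }

    // Newer transactional in-memory dataset; the whole load runs
    // inside one explicit write transaction.
    static Dataset loadTxnMem(String path) {
        Dataset ds = DatasetFactory.createTxnMem();
        ds.begin(ReadWrite.WRITE);
        try {
            RDFDataMgr.read(ds, path);
            ds.commit();
        } finally {
            ds.end();
        }
        return ds;
    }

    public static void main(String[] args) {
        // File name is an assumption; pass your own path as the first argument.
        String path = args.length > 0 ? args[0] : "sample-data.ttl";
        Dataset ds = loadTxnMem(path);
        ds.begin(ReadWrite.READ);
        try {
            System.out.println("Default graph size: " + ds.getDefaultModel().size());
        } finally {
            ds.end();
        }
    }
}
```

Timing both helpers against the same file, with the same heap settings, should show whether the choice of dataset implementation accounts for the difference being reported.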
> java.lang.OutOfMemoryError: GC overhead limit exceeded
> ------------------------------------------------------
>
> Key: JENA-1138
> URL: https://issues.apache.org/jira/browse/JENA-1138
> Project: Apache Jena
> Issue Type: Bug
> Components: Cmd line tools
> Affects Versions: Jena 3.0.1
> Environment: Oracle JDK 1.8.0, Windows 7 64bit
> Reporter: Giovanni Mels
> Labels: performance
>
> Since 3.0.1 we get {{java.lang.OutOfMemoryError: GC overhead limit exceeded}}
> exceptions when using the {{sparql}} command line tool, even on relatively
> small datasets (~1.6 million triples).
> The issue occurs when the dataset is loaded in memory, so before the actual
> query execution.
> {code}
> sparql --query empty.rq --data sample-data.ttl
> {code}
> Where {{empty.rq}} contains:
> {noformat}
> SELECT * WHERE {}
> {noformat}
> This query takes ~20 seconds using Jena 2.13.0 and Jena 3.0.0, but it fails
> with 3.0.1 after ~4 minutes with {{java.lang.OutOfMemoryError: GC overhead
> limit exceeded}}.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)