[
https://issues.apache.org/jira/browse/JENA-1138?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15141336#comment-15141336
]
A. Soroka commented on JENA-1138:
---------------------------------
Yes, I agree, but when I look at the profiling, nothing crazy is happening, or
at least nothing that we wouldn't expect. The objects are pretty much as they
should be; it's just that the new dataset implementation keeps a lot of slots
for every triple, as you know.

I guess another way to approach this would be to bring down the general
hungriness for heap by offering some knobs and switches. For example, the
default graph currently uses three synced indexes (SPO, POS, OSP). I could add
some parameterization there and let the user choose to use one, two or three,
selecting for faster load time and lower heap usage at the expense of slower
queries in some cases.
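
To make the trade-off concrete, here is a minimal sketch of a store whose set of indexes is chosen at construction time. It is not Jena's implementation and does not use any Jena API: the class ToyTripleStore, its Index enum, and the methods add, indexEntries, and find are hypothetical names invented for illustration, and triples are plain lists of three strings rather than RDF nodes.

```java
import java.util.*;

// Hypothetical toy store, NOT Jena's implementation: it only illustrates how
// the number of enabled indexes trades heap use against lookup speed.
final class ToyTripleStore {

    // Each index is keyed on one triple component: S (0), P (1) or O (2).
    enum Index {
        SPO(0), POS(1), OSP(2);
        final int keyPosition;
        Index(int keyPosition) { this.keyPosition = keyPosition; }
    }

    // For each enabled index: key component -> triples having that component.
    private final Map<Index, Map<String, List<List<String>>>> indexes =
            new EnumMap<>(Index.class);

    ToyTripleStore(Set<Index> enabled) {
        if (enabled.isEmpty()) {
            throw new IllegalArgumentException("at least one index is required");
        }
        for (Index index : enabled) {
            indexes.put(index, new HashMap<>());
        }
    }

    // Every add touches every enabled index, so load work and heap use
    // grow with the number of indexes.
    void add(String s, String p, String o) {
        List<String> triple = Arrays.asList(s, p, o);
        for (Map.Entry<Index, Map<String, List<List<String>>>> e : indexes.entrySet()) {
            String key = triple.get(e.getKey().keyPosition);
            e.getValue().computeIfAbsent(key, k -> new ArrayList<>()).add(triple);
        }
    }

    // Total number of entries held across all enabled indexes.
    int indexEntries() {
        return indexes.values().stream()
                .flatMap(index -> index.values().stream())
                .mapToInt(List::size)
                .sum();
    }

    // Pattern match where null is a wildcard.
    List<List<String>> find(String s, String p, String o) {
        List<String> pattern = Arrays.asList(s, p, o);
        List<List<String>> candidates = null;
        for (Map.Entry<Index, Map<String, List<List<String>>>> e : indexes.entrySet()) {
            String key = pattern.get(e.getKey().keyPosition);
            if (key != null) {
                // Direct lookup in an index keyed on a bound component.
                candidates = e.getValue().getOrDefault(key, Collections.emptyList());
                break;
            }
        }
        if (candidates == null) {
            // No usable index: fall back to scanning every stored triple.
            candidates = new ArrayList<>();
            for (List<List<String>> bucket : indexes.values().iterator().next().values()) {
                candidates.addAll(bucket);
            }
        }
        List<List<String>> result = new ArrayList<>();
        for (List<String> t : candidates) {
            boolean matches = true;
            for (int i = 0; i < 3; i++) {
                matches &= pattern.get(i) == null || pattern.get(i).equals(t.get(i));
            }
            if (matches) {
                result.add(t);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        ToyTripleStore lean = new ToyTripleStore(EnumSet.of(Index.SPO));
        ToyTripleStore full = new ToyTripleStore(EnumSet.allOf(Index.class));
        for (ToyTripleStore store : Arrays.asList(lean, full)) {
            store.add("a", "knows", "b");
            store.add("b", "knows", "c");
        }
        System.out.println("entries with one index: " + lean.indexEntries());
        System.out.println("entries with three indexes: " + full.indexEntries());
    }
}
```

In this sketch a store built with only SPO holds one entry per triple, while a store built with all three indexes holds three. Both return the same results for a pattern such as (null, "knows", null), but the single-index store has to scan every triple to answer it.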
> java.lang.OutOfMemoryError: GC overhead limit exceeded
> ------------------------------------------------------
>
> Key: JENA-1138
> URL: https://issues.apache.org/jira/browse/JENA-1138
> Project: Apache Jena
> Issue Type: Bug
> Components: Cmd line tools
> Affects Versions: Jena 3.0.1
> Environment: Oracle JDK 1.8.0, Windows 7 64bit
> Reporter: Giovanni Mels
> Labels: performance
> Attachments: sample-data.zip
>
>
> Since 3.0.1 we get {{java.lang.OutOfMemoryError: GC overhead limit exceeded}}
> exceptions when using the {{sparql}} command line tool, even on relatively
> small datasets (~1.6 million triples).
> The issue occurs when the dataset is loaded in memory, so before the actual
> query execution.
> {code}
> sparql --query empty.rq --data sample-data.ttl
> {code}
> Where {{empty.rq}} contains:
> {noformat}
> SELECT * WHERE {}
> {noformat}
> This query takes ~20 seconds using Jena 2.13.0 and Jena 3.0.0; it fails with
> 3.0.1 after ~4 minutes with {{java.lang.OutOfMemoryError: GC overhead limit
> exceeded}}