[ https://issues.apache.org/jira/browse/JENA-985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14651018#comment-14651018 ]
Andy Seaborne commented on JENA-985:
------------------------------------
riot does not fix triples, but {{riot --validate}} applies all the checks done
during TDB loading, plus some more. If you think the data is bad, check and fix
it before loading. Loading is not transactional and, even if it were, broken
data causing an abort would roll everything back.
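As a rough illustration of "check before loading", here is a minimal shell sketch that only runs the TDB loader when validation succeeds. The function name validate_then_load and its arguments are hypothetical. It also assumes that riot exits with a non-zero status when it reports errors, which is worth confirming for your Jena version.

```shell
# Hypothetical helper: validate an RDF file, then load it into TDB only if
# validation passed.
# Assumption: "riot --validate" returns a non-zero exit status on errors.
validate_then_load() {
    vtl_file=$1   # RDF file to check and load
    vtl_db=$2     # TDB database directory

    if riot --validate "$vtl_file"; then
        # Data passed the checks, so start the (non-transactional) load.
        tdbloader --loc="$vtl_db" "$vtl_file"
    else
        echo "validation failed; not loading $vtl_file" >&2
        return 1
    fi
}
```

Usage would be something like validate_then_load dump.nt DB, where dump.nt and DB are placeholder names. Because the load is skipped when validation fails, bad data never gets a chance to leave a partially loaded database behind.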
> Iterate using Apache Jena ExtendedIterator on Graph with big amount of triples
> ------------------------------------------------------------------------------
>
> Key: JENA-985
> URL: https://issues.apache.org/jira/browse/JENA-985
> Project: Apache Jena
> Issue Type: Bug
> Components: Core
> Affects Versions: Jena 2.13.0
> Environment: *Hardware*
> Windows 7 64-bit
> Intel Core i7 4785T @ 2.20GHz
> RAM 16,0GB DDR3
> 465GB Samsung SSD 850 EVO 500G SCSI Disk Device (SSD)
> *Software environment*
> java version "1.7.0_75"
> Java(TM) SE Runtime Environment (build 1.7.0_75-b13)
> Java HotSpot(TM) 64-Bit Server VM (build 24.75-b04, mixed mode)
> *Running options*
> VM options: -Xmx14g
> Reporter: Eugene Tenkaev
> Priority: Minor
>
> I'm generating an Apache Jena Graph from DBpedia dumps, and now I want to
> iterate through all "dbpedia-owl:abstract" triples.
> So I do something like this:
> {code:java}
> // graph is the Graph instance built from the DBpedia dumps
> ExtendedIterator<Triple> iterator = graph.find(Node.ANY,
> NodeFactory.createURI("dbpedia-owl:abstract"), Node.ANY);
> {code}
> But when I iterate, memory consumption increases, so it looks like
> "ExtendedIterator" stores the nodes it finds.
> I used the VisualVM profiler and found that, while I iterate, the count of
> "com.hp.hpl.jena.graph.Node_URI" instances keeps increasing.
> I tried calling "iterator.reset()", but it has no effect.
> Is this a bug or a feature? :D
> Can I iterate through all DBpedia abstracts without storing nodes and without
> increasing memory consumption that the GC can't free?
> Sorry for my bad English.