arne-bdt opened a new pull request, #3245:
URL: https://github.com/apache/jena/pull/3245
Added optional indexing strategies for `GraphMem2Roaring` and its
`RoaringTripleStore`.
GitHub issue resolved #3226
Pull request Description:
- added new enum `org.apache.jena.mem2.IndexingStrategy` with the
following strategies:
  - `EAGER`, `LAZY`, `LAZY_PARALLEL`, `MANUAL`, `MINIMAL`
- implemented all strategies in
`org.apache.jena.mem2.store.roaring.RoaringTripleStore` and added the following
methods:
  - `#getIndexingStrategy`
  - `#clearIndex`
  - `#initializeIndex`
  - `#initializeIndexParallel`
  - `#isIndexInitialized`
- implemented additional tests for `RoaringTripleStore`
- extended `GraphMem2Roaring` to support the indexing strategies and added
new methods:
  - `#getIndexingStrategy`
  - `#clearIndex`
  - `#initializeIndex`
  - `#initializeIndexParallel`
  - `#isIndexInitialized`
- implemented additional tests for `GraphMem2Roaring`
- refactored `RoaringTripleStore` by extracting:
  - `org.apache.jena.mem2.store.roaring.NodesToBitmapsMap`
  - `org.apache.jena.mem2.store.roaring.TripleSet`
  - `org.apache.jena.mem2.store.roaring.strategies.StoreStrategy`
  - `org.apache.jena.mem2.store.roaring.strategies.EagerStoreStrategy`
  - `org.apache.jena.mem2.store.roaring.strategies.LazyStoreStrategy`
  - `org.apache.jena.mem2.store.roaring.strategies.ManualStoreStrategy`
  - `org.apache.jena.mem2.store.roaring.strategies.MinimalStoreStrategy`
- added new record `FastHashSet.IndexedKey` to pair index and key of an entry
- added new methods to `org.apache.jena.mem2.collection.FastHashSet`:
  - `#indexedKeyIterator`
  - `#indexedKeySpliterator`
  - `#indexedKeyStream`
  - `#indexedKeyStreamParallel`
- added `org.apache.jena.mem2.iterator.SparseArrayIndexedIterator`
- added `org.apache.jena.mem2.spliterator.SparseArrayIndexedSpliterator`
- added tests for new methods and classes
- fixed minor bugs in existing `SparseArraySpliteratorTest` and
`SparseArraySubSpliteratorTest`
- updated `org.apache.jena.mem.graph.helper.Context` and
`GraphTripleNodeHelperCurrent` to support parameterized benchmarks for the new
indexing strategies
- updated existing benchmark tests with reasonable default parameters for
the new indexing strategies
- added benchmark test `org.apache.jena.mem.graph.TestGraphInitializeIndex`
to evaluate the performance of `#initializeIndex` vs. `#initializeIndexParallel`
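
For readers unfamiliar with the idea, the difference between the strategies listed above can be illustrated with a small, self-contained toy. This is only a sketch and not the code in this PR: `ToyIndexingStrategy` and `ToyTripleStore` are hypothetical names, the index is a plain `HashMap` instead of Roaring bitmaps, and `LAZY_PARALLEL` is left out. The behaviours shown for `MANUAL` (throwing until the index is built) and `MINIMAL` (falling back to a scan) are assumptions made for the illustration and may differ from the actual implementation.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-in for the strategies named in the PR; not Jena's enum. */
enum ToyIndexingStrategy { EAGER, LAZY, MANUAL, MINIMAL }

/** Toy store: a list of triples plus an optional index by subject. */
class ToyTripleStore {
    record Triple(String s, String p, String o) {}

    private final ToyIndexingStrategy strategy;
    private final List<Triple> triples = new ArrayList<>();
    private Map<String, List<Triple>> bySubject; // null while no index exists

    ToyTripleStore(ToyIndexingStrategy strategy) {
        this.strategy = strategy;
        if (strategy == ToyIndexingStrategy.EAGER) {
            bySubject = new HashMap<>(); // EAGER: index exists from the start
        }
    }

    ToyIndexingStrategy getIndexingStrategy() { return strategy; }

    boolean isIndexInitialized() { return bySubject != null; }

    void add(String s, String p, String o) {
        Triple t = new Triple(s, p, o);
        triples.add(t);
        if (bySubject != null) { // maintain the index only if it exists
            bySubject.computeIfAbsent(s, k -> new ArrayList<>()).add(t);
        }
    }

    /** Builds the index from scratch over all stored triples. */
    void initializeIndex() {
        Map<String, List<Triple>> index = new HashMap<>();
        for (Triple t : triples) {
            index.computeIfAbsent(t.s(), k -> new ArrayList<>()).add(t);
        }
        bySubject = index;
    }

    /** Drops the index; in this toy, EAGER rebuilds it immediately (assumption). */
    void clearIndex() {
        bySubject = null;
        if (strategy == ToyIndexingStrategy.EAGER) {
            initializeIndex();
        }
    }

    List<Triple> findBySubject(String s) {
        if (bySubject == null) {
            switch (strategy) {
                case LAZY -> initializeIndex(); // build on first lookup
                case MANUAL -> throw new IllegalStateException("index not initialized");
                default -> { // MINIMAL: answer by scanning, never build implicitly
                    return triples.stream().filter(t -> t.s().equals(s)).toList();
                }
            }
        }
        return bySubject.getOrDefault(s, List.of());
    }
}
```

In this toy, bulk loading into a `LAZY` or `MANUAL` store skips all index maintenance in `add`, which is the kind of trade-off the benchmark parameters mentioned above are meant to explore.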
----
- [x] Tests are included.
- **On the [Apache Jena website](https://github.com/apache/jena-site/)
there is no special documentation for the GraphMem2Roaring. Only the javadoc
may need an update. Does this have to be part of this PR?**
- [x] Commits have been squashed to remove intermediate development commit
messages.
- [x] Key commit messages start with the issue number (GH-xxxx)
By submitting this pull request, I acknowledge that I am making a
contribution to the Apache Software Foundation under the terms and conditions
of the [Contributor's
Agreement](https://www.apache.org/licenses/contributor-agreements.html).
----
See the [Apache Jena "Contributing"
guide](https://github.com/apache/jena/blob/main/CONTRIBUTING.md).
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]