arne-bdt commented on issue #3226: URL: https://github.com/apache/jena/issues/3226#issuecomment-2925683911
@afs Thanks for taking a look at the idea.

> There is also a mode of "MINIMAL" - it just has the TripleSet and any find pattern is done as a scan.

I added MINIMAL to the modes in the description.

> Not sure changing the Graph API for all graphs is a good idea.

Sorry, I was not clear enough: I would add these methods to `GraphMem2RoaringLazyIndexing` and `RoaringTripleStore` only. I am not yet sure that the proposed API extensions are general enough to extract into an interface.

> How big are the indexes compared to the triple set?

I added a table with the RAM consumption. Column `GraphMem2RoaringLazyIndexing` shows the size of the triple set alone, whereas column `GraphMem2Roaring` shows the size of the triple set plus the index structures. The space for the triples themselves is not included in the table; it covers only the additional data structures of the graph instances.

> An alternative is a StreamRDF...

StreamRDF is not as versatile as a Graph, which can be used almost anywhere.

> Indexing: ...

As indexes, `RoaringTripleStore` maintains only S__, _P_ and __O. The other permutations are covered by bitwise AND operations on the RoaringBitmaps.

One could initialize each index only when it is first used, but that would add complexity for limited benefit: it would only help in the rare cases where just one of the three indices is ever needed.

The TripleSet reuses free indices after deletion, so one would have to maintain an additional list of not-yet-indexed triples to support partial indexing. I am not sure there are many scenarios where suspending indexing and resuming it later would be a critical use case.

Multi-threading could be an option. At least when the index is rebuilt, the three basic indices could be built in parallel...
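
To illustrate the indexing scheme described above, here is a minimal sketch, not the actual `RoaringTripleStore` code. It assumes triples are plain `String[3]` arrays and uses `java.util.BitSet` as a stand-in for RoaringBitmap; the class name `BitmapTripleIndexSketch` and all of its methods are hypothetical. It keeps only the three single-term indices, answers combined patterns by AND-ing bitmaps, and reuses freed slots after deletion.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.BitSet;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: java.util.BitSet stands in for RoaringBitmap,
// and String[3] = {s, p, o} stands in for a Jena Triple.
// Duplicate triples are not detected (no set semantics).
final class BitmapTripleIndexSketch {
    // The position of a triple in this list is its index in the bitmaps.
    private final List<String[]> triples = new ArrayList<>();
    // Slots freed by remove(), reused by the next add().
    private final Deque<Integer> freeSlots = new ArrayDeque<>();
    // The three basic indices: S__, _P_ and __O.
    private final Map<String, BitSet> bySubject = new HashMap<>();
    private final Map<String, BitSet> byPredicate = new HashMap<>();
    private final Map<String, BitSet> byObject = new HashMap<>();

    int add(String s, String p, String o) {
        String[] t = {s, p, o};
        int slot;
        if (freeSlots.isEmpty()) {
            slot = triples.size();
            triples.add(t);
        } else {
            slot = freeSlots.pop();
            triples.set(slot, t);
        }
        bySubject.computeIfAbsent(s, k -> new BitSet()).set(slot);
        byPredicate.computeIfAbsent(p, k -> new BitSet()).set(slot);
        byObject.computeIfAbsent(o, k -> new BitSet()).set(slot);
        return slot;
    }

    void remove(int slot) {
        String[] t = triples.get(slot);
        if (t == null) return; // slot already free
        bySubject.get(t[0]).clear(slot);
        byPredicate.get(t[1]).clear(slot);
        byObject.get(t[2]).clear(slot);
        triples.set(slot, null);
        freeSlots.push(slot);
    }

    // null means "any"; patterns such as SP_ or S_O are answered by AND-ing bitmaps.
    List<String[]> find(String s, String p, String o) {
        BitSet hits = intersect(null, s, bySubject);
        hits = intersect(hits, p, byPredicate);
        hits = intersect(hits, o, byObject);
        List<String[]> out = new ArrayList<>();
        if (hits == null) { // all wildcards: scan the triple list
            for (String[] t : triples) if (t != null) out.add(t);
            return out;
        }
        for (int i = hits.nextSetBit(0); i >= 0; i = hits.nextSetBit(i + 1)) {
            out.add(triples.get(i));
        }
        return out;
    }

    private static BitSet intersect(BitSet acc, String term, Map<String, BitSet> index) {
        if (term == null) return acc;          // wildcard: no constraint
        BitSet bits = index.get(term);
        if (bits == null) return new BitSet(); // unknown term: no matches
        if (acc == null) return (BitSet) bits.clone();
        acc.and(bits);
        return acc;
    }

    public static void main(String[] args) {
        BitmapTripleIndexSketch g = new BitmapTripleIndexSketch();
        g.add("s1", "p1", "o1");
        g.add("s1", "p2", "o1");
        System.out.println(g.find("s1", null, "o1").size());
    }
}
```

Because `remove` hands slots back for reuse, a lazily built index could not simply index "everything above slot N" when it resumes. It would need a separate record of which slots are not yet indexed, which is the extra bookkeeping mentioned above.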
