arne-bdt commented on issue #3226:
URL: https://github.com/apache/jena/issues/3226#issuecomment-2925683911

   @afs Thanks for taking a look at the idea.
   
   > There is also a mode of "MINIMAL" - it just has the TripleSet and any find 
pattern is done as a scan.
   
   I added MINIMAL to the modes in the description.
   
   > Not sure changing the Graph API for all graphs is a good idea.
   
   Sorry, I was not clear enough: I would add these methods to 
`GraphMem2RoaringLazyIndexing` and `RoaringTripleStore` only. 
   I am not yet sure that the proposed API extensions are general enough to be 
extracted into an interface.
   
   > How big are the indexes compared to the triple set?
   
   I added a table with the RAM consumption. Column 
`GraphMem2RoaringLazyIndexing` shows the size of the triple set alone, whereas 
column `GraphMem2Roaring` shows the size of the triple set plus the index 
structures. 
   The space for the triples themselves is not included in the table; it covers 
only the additional data structures of the graph instances.
   
   > An alternative is a StreamRDF...
   
   StreamRDF is not as versatile as a Graph, which can be used almost anywhere.
   
   > Indexing: ...
   
   `RoaringTripleStore` maintains only three indexes: `S__`, `_P_` and `__O`. The 
other combinations are covered by the bitwise AND operations of the 
RoaringBitmaps. One could initialize each index only when it is first used, but 
that would add complexity for limited benefit: it only helps in the rare cases 
where only one of the three indices is ever needed.
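   
   To illustrate the idea, here is a minimal sketch, not the actual 
`RoaringTripleStore` code. It assumes triples are plain strings, uses 
`java.util.BitSet` as a stand-in for RoaringBitmap, and the class name 
`BitmapTripleIndexSketch` is made up for this example. It does not deduplicate 
triples or support deletion.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch: three single-position indexes combined with bitwise AND. */
class BitmapTripleIndexSketch {
    // The position of a triple in this list is its bit index in every bitmap.
    private final List<String[]> triples = new ArrayList<>();
    private final Map<String, BitSet> bySubject = new HashMap<>();   // S__
    private final Map<String, BitSet> byPredicate = new HashMap<>(); // _P_
    private final Map<String, BitSet> byObject = new HashMap<>();    // __O

    void add(String s, String p, String o) {
        int id = triples.size();
        triples.add(new String[] {s, p, o});
        bySubject.computeIfAbsent(s, k -> new BitSet()).set(id);
        byPredicate.computeIfAbsent(p, k -> new BitSet()).set(id);
        byObject.computeIfAbsent(o, k -> new BitSet()).set(id);
    }

    /** Returns the ids of all matching triples; a null argument is a wildcard. */
    BitSet find(String s, String p, String o) {
        BitSet result = new BitSet();
        result.set(0, triples.size()); // start with every triple id
        if (s != null) result.and(bySubject.getOrDefault(s, new BitSet()));
        if (p != null) result.and(byPredicate.getOrDefault(p, new BitSet()));
        if (o != null) result.and(byObject.getOrDefault(o, new BitSet()));
        return result;
    }

    String[] get(int id) {
        return triples.get(id);
    }
}
```

   A pattern such as `SP_` is then just the intersection of the `S__` and `_P_` 
bitmaps, so no dedicated index is needed for it.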
   
   The TripleSet reuses free indices after deletion, so one would have to 
maintain an additional list of not-yet-indexed triples to support partial 
indexing. I am not sure there are many scenarios where suspending indexing and 
later resuming it would be a critical use case.
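   
   A small sketch of why slot reuse gets in the way, assuming a simple free-list 
design (the class `SlotReusingSetSketch` is hypothetical and not the real 
TripleSet): because a new triple can land on an old position, a simple "index 
everything above position N" watermark is not enough to find the triples that 
still need indexing.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

/** Hypothetical sketch of a set that hands out freed positions again. */
class SlotReusingSetSketch {
    private final List<String> slots = new ArrayList<>();
    private final Deque<Integer> freeSlots = new ArrayDeque<>();

    /** Stores the triple and returns its position, reusing a freed one if available. */
    int add(String triple) {
        Integer free = freeSlots.poll();
        if (free != null) {
            slots.set(free, triple);
            return free;
        }
        slots.add(triple);
        return slots.size() - 1;
    }

    /** Frees the position so that a later add can reuse it. */
    void remove(int index) {
        if (slots.get(index) == null) {
            return; // already free
        }
        slots.set(index, null);
        freeSlots.push(index);
    }

    String get(int index) {
        return slots.get(index);
    }
}
```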
   
   Multi-threading could be an option. At least when the index is rebuilt, the 
three basic indices could be built in parallel...
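   
   A rough sketch of what that could look like, under the assumption that the 
triple set is not modified while the rebuild runs. It again uses 
`java.util.BitSet` in place of RoaringBitmap, and `ParallelIndexBuildSketch` is 
an illustrative name only. Each task fills its own map, so the tasks share no 
mutable state.

```java
import java.util.BitSet;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.CompletableFuture;

/** Hypothetical sketch: build the S__, _P_ and __O indexes as three independent tasks. */
class ParallelIndexBuildSketch {

    /** Builds one index: node at the given position (0=S, 1=P, 2=O) to bitmap of triple ids. */
    static Map<String, BitSet> buildIndex(List<String[]> triples, int position) {
        Map<String, BitSet> index = new HashMap<>();
        for (int id = 0; id < triples.size(); id++) {
            index.computeIfAbsent(triples.get(id)[position], k -> new BitSet()).set(id);
        }
        return index;
    }

    /** Returns the three indexes in the order S__, _P_, __O. */
    static List<Map<String, BitSet>> buildAll(List<String[]> triples) {
        CompletableFuture<Map<String, BitSet>> s =
                CompletableFuture.supplyAsync(() -> buildIndex(triples, 0));
        CompletableFuture<Map<String, BitSet>> p =
                CompletableFuture.supplyAsync(() -> buildIndex(triples, 1));
        CompletableFuture<Map<String, BitSet>> o =
                CompletableFuture.supplyAsync(() -> buildIndex(triples, 2));
        // join() waits for each task; the triple list is only read, never written.
        return List.of(s.join(), p.join(), o.join());
    }
}
```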


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

