Github user afs commented on the pull request:

    https://github.com/apache/jena/pull/123#issuecomment-174066342
  
    **Design**
    
    Protecting the text index this way sort of works for TDB specifically 
because of an internal feature of TDB (it manages storage to stop duplication) 
which is not a guaranteed feature. Other dataset implementations will not work 
out so nicely. It will be like two separate datasets and one index, which will 
probably lead to corruption or inconsistent reads (cf. email ["transactions 
and 
docProducers"](http://mail-archives.apache.org/mod_mbox/jena-users/201601.mbox/%3C568FD70B.8060301%40epimorphics.com%3E)).
    
    On [JENA-1122](https://issues.apache.org/jira/browse/JENA-1122) I 
summarized discussions up to here as two options suggested:
    
    1. Internal static state in `TextDatasetFactory` so that the same dataset 
object is returned each time, cf. TDB's `StoreConnection`. This extends sharing 
of text datasets to work with Java API use but not with "any dataset" in Fuseki 
configurations.
    2. Fuseki (or maybe `DatasetAssembler`), when assembling datasets, deals 
with sharing using the graph structure. This copes with any dataset but not 
with API use. 
    
    The first one looks hard because, in the general case, it means choosing a 
key that includes the dataset.
    
    The second one is easier to do because there is a natural key for the 
dataset: its resource (URI, bnode). A bonus would be a similar per-text-index 
assembler check on reuse 
[JENA-1104](https://issues.apache.org/jira/browse/JENA-1104).
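    
    As a rough illustration of (2), not taken from Jena's code: a minimal 
sketch of a registry keyed by the assembler resource. The class name 
`SharedAssemblyRegistry` and the use of a plain string key (a URI, or a bnode 
label that is only meaningful within one assembler graph) are assumptions made 
for this example.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;

// Hypothetical helper, not a Jena class: returns one shared object per
// assembler resource key so that two references to the same resource
// do not build two separate datasets over one text index.
final class SharedAssemblyRegistry<T> {
    // Key: resource identifier (URI, or a bnode label scoped to one graph).
    private final Map<String, T> built = new ConcurrentHashMap<>();

    // Build the object on first request; later requests for the same key
    // get the instance that was already built.
    T getOrBuild(String resourceKey, Supplier<T> builder) {
        return built.computeIfAbsent(resourceKey, k -> builder.get());
    }

    // True if something has already been assembled for this key.
    boolean isBuilt(String resourceKey) {
        return built.containsKey(resourceKey);
    }
}
```

    The same shape would apply to a per-text-index check: key on the index 
resource instead of the dataset resource.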
    
    There is one minor point: Fuseki can have multiple assembler files with 
badly chosen, clashing dataset URIs (solution: keep a list of all URIs across 
assembler configs, which is a useful check anyway).
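    
    A sketch of that cross-config check, under the assumption that each 
dataset URI and the file declaring it are available as strings; 
`DatasetUriCheck` and `declare` are hypothetical names, not Fuseki API.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical check, not part of Fuseki: remembers which assembler file
// first declared each dataset URI and rejects a clash from another file.
final class DatasetUriCheck {
    // Dataset URI -> assembler file that first declared it.
    private final Map<String, String> declaredIn = new HashMap<>();

    // Record a declaration; throw if a different file already used the URI.
    void declare(String datasetUri, String configFile) {
        String previous = declaredIn.putIfAbsent(datasetUri, configFile);
        if (previous != null && !previous.equals(configFile)) {
            throw new IllegalStateException(
                "Dataset URI " + datasetUri + " declared in both "
                + previous + " and " + configFile);
        }
    }
}
```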
    
    The ideal for [JENA-1122](https://issues.apache.org/jira/browse/JENA-1122) 
is this PR (simplified?) to protect text indexes and (2) above to allow complex 
configurations.

