Github user afs commented on the pull request:

https://github.com/apache/jena/pull/123#issuecomment-177243620

This PR protects against multiple creation of a text index (JENA-1104), not against two calls to create the same dataset for two services in Fuseki. By chance, TDB is less prone to problems if that happens, but that is luck. General datasets, e.g. with inference graphs, SDB or plain in-memory datasets, are likely exposed to problems.

Let's solve the immediate issue described in JENA-1122, then see if JENA-1104 needs addressing or whether the situations where it can still happen are uninteresting or have other problems, in which case the application must be responsible for creating the index only once.

For the record, there are some specific items with the current PR that I would like clarified or refuted before this code is used to address JENA-1107, if that is still needed.

1: `TextIndexLucene.close` is not reference counted.

* Create text index -> T1
* Create text index -> T2 (which is T1, shared)
* Close T2.
* Any T1 code will now crash - the index is closed.

2: Using WeakReferences and managing `close()` seems to be duplicating lifecycle management. I am not clear that the WeakReference to the Lucene index helps: there are no finalizers, so GC finalization does not tidy up Lucene. A freed WeakReference would cause a new attempt to create the index, but it will hit the state lock.

3: Creation by `createLuceneIndex` is not thread safe. It has a get-create-put timing hole.

4: Need to be clear on the contract for "same `Directory`, different Lucene configuration (`TextIndexConfig`)".
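As a rough illustration of items 1 and 3, the following is a minimal sketch of one way a shared index could be reference counted and created atomically. It is not Jena's code: `SharedIndex`, `IndexRegistry`, `acquire` and `release` are hypothetical names, the index is keyed by a plain string standing in for the Lucene `Directory`, and the question in item 4 (same directory, different configuration) is deliberately left out.

```java
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical stand-in for a text index; not the real TextIndexLucene. */
class SharedIndex {
    final String directoryKey;
    int refCount = 0;                 // only modified inside the registry's atomic map operations
    private volatile boolean closed = false;

    SharedIndex(String directoryKey) {
        this.directoryKey = directoryKey;
    }

    boolean isClosed() {
        return closed;
    }

    /** Would release the underlying Lucene resources in a real implementation. */
    void closeUnderlying() {
        closed = true;
    }
}

/** Hypothetical registry: one shared index per directory key. */
class IndexRegistry {
    private final ConcurrentHashMap<String, SharedIndex> open = new ConcurrentHashMap<>();

    /** Create-or-share in one atomic step, so there is no get-create-put hole. */
    SharedIndex acquire(String directoryKey) {
        return open.compute(directoryKey, (key, existing) -> {
            SharedIndex index = (existing != null) ? existing : new SharedIndex(key);
            index.refCount++;
            return index;
        });
    }

    /** Drop one reference; the index is really closed only by the last holder. */
    void release(SharedIndex index) {
        open.computeIfPresent(index.directoryKey, (key, existing) -> {
            if (existing != index) {
                return existing;      // stale handle: a newer index owns this key
            }
            if (--existing.refCount > 0) {
                return existing;      // other holders remain, keep it open
            }
            existing.closeUnderlying();
            return null;              // returning null removes the mapping
        });
    }
}
```

With this shape, the T1/T2 sequence from item 1 no longer breaks T1: closing T2 only decrements the count, and the index is closed when T1 is released as well. Because lifecycle is tracked explicitly, no `WeakReference` is needed in this sketch.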