[GitHub] jena pull request: Add memoizing of LuceneTextIndexes so that ther...

afs Sat, 30 Jan 2016 08:55:37 -0800

Github user afs commented on the pull request:

    https://github.com/apache/jena/pull/123#issuecomment-177243620
  
    This PR protects against multiple creation of a text index (JENA-1104), not 
against two calls to create the same dataset for two services in Fuseki. TDB 
happens to be less prone to problems in that case, but that is luck. General 
datasets, e.g. those with inference graphs, SDB, or plain in-memory datasets, 
are likely to be exposed to problems.
    
    Let's solve the immediate issue described in JENA-1122 first. Then we can 
see whether JENA-1104 needs addressing, or whether the situations where it can 
still happen are uninteresting or have other problems, in which case the 
application must be responsible for creating the index only once.
    
    For the record, there are some specific items with the current PR that I 
would like clarified or refuted before this code is used to address JENA-1107, 
if that is still needed.
    
    1: `TextIndexLucene.close` is not reference counted.
    
    * Create text index -> T1
    * Create text index -> T2 (which is T1, shared)
    * Close T2.
    * Any T1 code will now crash - the index is closed.
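
    To illustrate what reference counting would mean here, a minimal sketch 
follows. `RefCountedIndex` is a hypothetical stand-in, not 
`TextIndexLucene`, and the `acquire` method is an assumed name; the point is 
only that the underlying resource is released when the last holder closes it.

```java
/** Hypothetical stand-in for a shared text index; not Jena's TextIndexLucene. */
class RefCountedIndex implements AutoCloseable {
    private int refs = 1;          // the creator holds the first reference
    private boolean closed = false;

    /** Hand out another reference to the same underlying index. */
    synchronized RefCountedIndex acquire() {
        if (closed) {
            throw new IllegalStateException("index already closed");
        }
        refs++;
        return this;
    }

    /** Release one reference; the index really closes only at zero. */
    @Override
    public synchronized void close() {
        if (closed) {
            return;                // extra close() calls are ignored
        }
        if (--refs == 0) {
            closed = true;         // a real index would release Lucene here
        }
    }

    synchronized boolean isClosed() {
        return closed;
    }
}
```

    With this shape, "Close T2" in the sequence above would leave T1 usable 
until T1 is closed as well.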
    
    2: Using WeakReferences and managing `close()` seems to be duplicating 
lifecycle management.
    
    I am not clear that the WeakReference to the Lucene index helps because 
there are no finalizers, so GC finalization does not tidy up Lucene.  A freed 
WeakReference would cause a new attempt to create the index but it will hit the 
state lock.
    
    3: Creation by `createLuceneIndex` is not thread safe.  It has a 
get-create-put timing hole.
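
    One way to close a get-create-put hole, sketched under assumptions: the 
cache is a `ConcurrentHashMap` and creation goes through `computeIfAbsent`, 
which applies the creation function at most once per key. `IndexCache`, 
`getOrCreate`, and the `String` key (standing in for a Lucene `Directory`) are 
hypothetical names, not the PR's code.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical memoizing cache; a String key stands in for the Directory. */
class IndexCache {
    private static final Map<String, Object> CACHE = new ConcurrentHashMap<>();

    /** Counts how many times the (stand-in) index was actually created. */
    static final AtomicInteger CREATED = new AtomicInteger();

    /**
     * Lookup and creation happen as one atomic step per key, so two threads
     * asking for the same key cannot both run the creation function.
     */
    static Object getOrCreate(String directoryKey) {
        return CACHE.computeIfAbsent(directoryKey, k -> {
            CREATED.incrementAndGet();
            return new Object();   // stand-in for building the real index
        });
    }
}
```

    A separate `get`, then create, then `put` sequence lets two threads both 
observe a miss and both create; the single `computeIfAbsent` call removes that 
window.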
    
    4: Need to be clear on the contract for "same `Directory`, different Lucene 
configuration (`TextIndexConfig`)".
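
    One possible contract, purely as an illustration and not a proposal the 
PR already makes: remember the configuration an index was created with and 
reject a later request for the same directory with a different configuration. 
`ConfigCheckedCache` and its members are hypothetical, and plain `String` 
values stand in for `Directory` and `TextIndexConfig`.

```java
import java.util.Map;
import java.util.Objects;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical cache that refuses to share an index across differing configs. */
class ConfigCheckedCache {
    /** The cached index together with the configuration it was built with. */
    static final class Entry {
        final String config;               // stand-in for a TextIndexConfig
        final Object index = new Object(); // stand-in for the index itself
        Entry(String config) {
            this.config = config;
        }
    }

    private final Map<String, Entry> cache = new ConcurrentHashMap<>();

    Object getOrCreate(String directoryKey, String config) {
        Entry e = cache.computeIfAbsent(directoryKey, k -> new Entry(config));
        if (!Objects.equals(e.config, config)) {
            // Same directory, different configuration: fail loudly rather
            // than silently returning an index built with other settings.
            throw new IllegalStateException(
                "index for " + directoryKey + " already exists with a different configuration");
        }
        return e.index;
    }
}
```

    Other contracts are possible, for example "first configuration wins" with 
a logged warning; whichever is chosen should be documented.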



