[ 
https://issues.apache.org/jira/browse/LUCENE-10292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17529018#comment-17529018
 ] 

Dawid Weiss commented on LUCENE-10292:
--------------------------------------

I don't see any evidence in implementations of Lookup that build() can be 
called in a thread-safe manner.

Those testLookupsDuringReBuild are only working by a lucky chance (and rarely 
still fail!). The code typically releases semaphore permissions quickly here:
{code}
    // at every stage of the slow rebuild, we should still be able to get our 
original suggestions
    for (int i = 0; i < data.size(); i++) {
      initialChecks.check(suggester);
      rebuildGate.release();
    }
{code}
while the build() method is not even invoked yet because this line:
{code}
                suggester.build(
                    new InputArrayIterator(new DelayedIterator<>(suggester, 
rebuildGate, data.iterator())));
{code}
is semaphore-blocked in the constructor parameters (InputArrayIterator). So the 
result is that for suggester.build() is typically entered a long time after the 
check look has finished. It is enough to modify the code to:
{code}
    // at every stage of the slow rebuild, we should still be able to get our 
original suggestions
    for (int i = 0; i < data.size(); i++) {
      rebuildGate.release();
      Thread.sleep(100);
      initialChecks.check(suggester);
    }
{code}

to cause repeatable failures (this isn't a suggested fix but a demonstration 
that the code is currently broken).

> AnalyzingInfixSuggester thread safety: lookup() fails during (re)build()
> ------------------------------------------------------------------------
>
>                 Key: LUCENE-10292
>                 URL: https://issues.apache.org/jira/browse/LUCENE-10292
>             Project: Lucene - Core
>          Issue Type: Bug
>            Reporter: Chris M. Hostetter
>            Assignee: Chris M. Hostetter
>            Priority: Major
>             Fix For: 10.0 (main), 9.2
>
>         Attachments: LUCENE-10292-1.patch, LUCENE-10292-2.patch, 
> LUCENE-10292-3.patch, LUCENE-10292.patch
>
>
> I'm filing this based on anecdotal information from a Solr user w/o 
> experiencing it first hand (and I don't have a test case to demonstrate it) 
> but based on a reading of the code the underlying problem seems self 
> evident...
> With all other Lookup implementations I've examined, it is possible to call 
> {{lookup()}} regardless of whether another thread is concurrently calling 
> {{build()}} – in all cases I've seen, it is even possible to call 
> {{lookup()}} even if {{build()}} has never been called: the result is just an 
> "empty" {{List<LookupResult>}} 
> Typically this is works because the {{build()}} method uses temporary 
> datastructures until it's "build logic" is complete, at which point it 
> atomically replaces the datastructures used by the {{lookup()}} method.   In 
> the case of {{AnalyzingInfixSuggester}} however, the {{build()}} method 
> starts by closing & null'ing out the {{protected SearcherManager 
> searcherMgr}} (which it only populates again once it's completed building up 
> it's index) and then the lookup method starts with...
> {code:java}
>     if (searcherMgr == null) {
>       throw new IllegalStateException("suggester was not built");
>     }
> {code}
> ... meaning it is unsafe to call {{AnalyzingInfixSuggester.lookup()}} in any 
> situation where another thread may be calling 
> {{AnalyzingInfixSuggester.build()}}



--
This message was sent by Atlassian Jira
(v8.20.7#820007)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org

Reply via email to