[ 
https://issues.apache.org/jira/browse/LUCENE-10292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17529831#comment-17529831
 ] 

Dawid Weiss commented on LUCENE-10292:
--------------------------------------

> All I was really trying to do with these tests was demonstrate that data you 
> get out of the Lookup before you call build(), can still be gotten from the 
> Lookup while build() is incrementally consuming an iterator (which may take a 
> long time if you are building up from a long iterator) and that this behavior 
> is consistent across Lookup impls (as opposed to before i filed this issue, 
> when most Lookups worked that way, but AnalyzingInfixSuggester would throw an 
> ugly exception – which was certainly confusing to users who might switch from 
> one impl to another).

I guess I am not comfortable with the fact that this test works only by a lucky 
coincidence and tests the behavior that isn't guaranteed or documented by the 
Lookup class - this got me confused and I guess it'll confuse people looking at 
this code after me. It's not a personal stab at you, it's just something that 
smells fishy around this code in general.

When I was looking at the failure and tried to debug the test, I didn't see the 
reason why this test was necessary (I looked at the Lookup class 
documentation). When I understood what the test did, I looked at the 
implementations and they seemed to be designed with a single-thread model in 
mind (external synchronization between lookups and rebuilds).

For example, even now, if you had a tight loop in one thread calling lookup on 
an FSTCompletionLookup and this loop got compiled, then there's nothing 
preventing the compiler from reading higherWeightsCompletion and 
normalCompletion fields once and never again (they're regular fields in 
FSTCompletionLookup), even if you call build there multiple times in between... 
Is this likely to happen? I don't know. Is this possible? Sure. Maybe I'm 
oversensitive because I grew up on machines with much less strict cache 
coherency protocols but code like this makes me itchy.

> I didn't set out to make any hard & fast guarantee about the thread safety of 
> all lookups – just improve the one that awas obviously inconsistent with the 
> others (progress, not perfection)

That's my point. Either we should make the Lookup interface explicitly state 
that it's safe to call the build method from another thread or we shouldn't 
really guarantee (or test) this behavior. I don't want you to revert the 
changes you made but my gut feeling is that lookup implementations should be 
designed to be single-threaded or at least immutable (one publisher-multiple 
readers model) as it makes implementing them much easier - no volatiles, 
synchronization blocks, etc. 

Concurrency concerns should be handled by the code that uses Lookups - this 
code should know whether synchronization or two concurrent instances are 
required (one doing the lookups, potentially via multiple threads, one 
rebuilding). Perhaps a change in the API is needed to separate those two phases 
(build-use) and then the downstream code has to take care of handling/ swapping 
out Lookup reference where they're used - I don't know, I just state what I 
think.

> AnalyzingInfixSuggester thread safety: lookup() fails during (re)build()
> ------------------------------------------------------------------------
>
>                 Key: LUCENE-10292
>                 URL: https://issues.apache.org/jira/browse/LUCENE-10292
>             Project: Lucene - Core
>          Issue Type: Bug
>            Reporter: Chris M. Hostetter
>            Assignee: Chris M. Hostetter
>            Priority: Major
>             Fix For: 10.0 (main), 9.2
>
>         Attachments: LUCENE-10292-1.patch, LUCENE-10292-2.patch, 
> LUCENE-10292-3.patch, LUCENE-10292.patch
>
>
> I'm filing this based on anecdotal information from a Solr user w/o 
> experiencing it first hand (and I don't have a test case to demonstrate it) 
> but based on a reading of the code the underlying problem seems self 
> evident...
> With all other Lookup implementations I've examined, it is possible to call 
> {{lookup()}} regardless of whether another thread is concurrently calling 
> {{build()}} – in all cases I've seen, it is even possible to call 
> {{lookup()}} even if {{build()}} has never been called: the result is just an 
> "empty" {{List<LookupResult>}} 
> Typically this is works because the {{build()}} method uses temporary 
> datastructures until it's "build logic" is complete, at which point it 
> atomically replaces the datastructures used by the {{lookup()}} method.   In 
> the case of {{AnalyzingInfixSuggester}} however, the {{build()}} method 
> starts by closing & null'ing out the {{protected SearcherManager 
> searcherMgr}} (which it only populates again once it's completed building up 
> it's index) and then the lookup method starts with...
> {code:java}
>     if (searcherMgr == null) {
>       throw new IllegalStateException("suggester was not built");
>     }
> {code}
> ... meaning it is unsafe to call {{AnalyzingInfixSuggester.lookup()}} in any 
> situation where another thread may be calling 
> {{AnalyzingInfixSuggester.build()}}



--
This message was sent by Atlassian Jira
(v8.20.7#820007)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org

Reply via email to