[ https://issues.apache.org/jira/browse/LUCENE-10292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17529831#comment-17529831 ]
Dawid Weiss commented on LUCENE-10292: -------------------------------------- > All I was really trying to do with these tests was demonstrate that data you > get out of the Lookup before you call build(), can still be gotten from the > Lookup while build() is incrementally consuming an iterator (which may take a > long time if you are building up from a long iterator) and that this behavior > is consistent across Lookup impls (as opposed to before i filed this issue, > when most Lookups worked that way, but AnalyzingInfixSuggester would throw an > ugly exception – which was certainly confusing to users who might switch from > one impl to another). I guess I am not comfortable with the fact that this test works only by a lucky coincidence and tests the behavior that isn't guaranteed or documented by the Lookup class - this got me confused and I guess it'll confuse people looking at this code after me. It's not a personal stab at you, it's just something that smells fishy around this code in general. When I was looking at the failure and tried to debug the test, I didn't see the reason why this test was necessary (I looked at the Lookup class documentation). When I understood what the test did, I looked at the implementations and they seemed to be designed with a single-thread model in mind (external synchronization between lookups and rebuilds). For example, even now, if you had a tight loop in one thread calling lookup on an FSTCompletionLookup and this loop got compiled, then there's nothing preventing the compiler from reading higherWeightsCompletion and normalCompletion fields once and never again (they're regular fields in FSTCompletionLookup), even if you call build there multiple times in between... Is this likely to happen? I don't know. Is this possible? Sure. Maybe I'm oversensitive because I grew up on machines with much less strict cache coherency protocols but code like this makes me itchy. > I didn't set out to make any hard & fast guarantee about the thread safety of > all lookups – just improve the one that awas obviously inconsistent with the > others (progress, not perfection) That's my point. Either we should make the Lookup interface explicitly state that it's safe to call the build method from another thread or we shouldn't really guarantee (or test) this behavior. I don't want you to revert the changes you made but my gut feeling is that lookup implementations should be designed to be single-threaded or at least immutable (one publisher-multiple readers model) as it makes implementing them much easier - no volatiles, synchronization blocks, etc. Concurrency concerns should be handled by the code that uses Lookups - this code should know whether synchronization or two concurrent instances are required (one doing the lookups, potentially via multiple threads, one rebuilding). Perhaps a change in the API is needed to separate those two phases (build-use) and then the downstream code has to take care of handling/ swapping out Lookup reference where they're used - I don't know, I just state what I think. > AnalyzingInfixSuggester thread safety: lookup() fails during (re)build() > ------------------------------------------------------------------------ > > Key: LUCENE-10292 > URL: https://issues.apache.org/jira/browse/LUCENE-10292 > Project: Lucene - Core > Issue Type: Bug > Reporter: Chris M. Hostetter > Assignee: Chris M. Hostetter > Priority: Major > Fix For: 10.0 (main), 9.2 > > Attachments: LUCENE-10292-1.patch, LUCENE-10292-2.patch, > LUCENE-10292-3.patch, LUCENE-10292.patch > > > I'm filing this based on anecdotal information from a Solr user w/o > experiencing it first hand (and I don't have a test case to demonstrate it) > but based on a reading of the code the underlying problem seems self > evident... > With all other Lookup implementations I've examined, it is possible to call > {{lookup()}} regardless of whether another thread is concurrently calling > {{build()}} – in all cases I've seen, it is even possible to call > {{lookup()}} even if {{build()}} has never been called: the result is just an > "empty" {{List<LookupResult>}} > Typically this is works because the {{build()}} method uses temporary > datastructures until it's "build logic" is complete, at which point it > atomically replaces the datastructures used by the {{lookup()}} method. In > the case of {{AnalyzingInfixSuggester}} however, the {{build()}} method > starts by closing & null'ing out the {{protected SearcherManager > searcherMgr}} (which it only populates again once it's completed building up > it's index) and then the lookup method starts with... > {code:java} > if (searcherMgr == null) { > throw new IllegalStateException("suggester was not built"); > } > {code} > ... meaning it is unsafe to call {{AnalyzingInfixSuggester.lookup()}} in any > situation where another thread may be calling > {{AnalyzingInfixSuggester.build()}} -- This message was sent by Atlassian Jira (v8.20.7#820007) --------------------------------------------------------------------- To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org