[ https://issues.apache.org/jira/browse/LUCENE-9791?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17287196#comment-17287196 ]
Uwe Schindler commented on LUCENE-9791: --------------------------------------- Another idea would be to remove the scratch BytesRef: Maybe we can simply access the byte[] behind the BytesRefHash and use a standard Arrays.equals() with correct offset and count. As this is a private method, I see no issue in this. Nevertheless: You should never ever add new entries while doing lookups at same time. IMHO, a class like this should allow lookups concurrently, once all entries were added (like a HashMap: If you have build a HashMap, you can concurerntly access the getters/contains - but never ever change it again). Maybe another clean way would be some method in BytesRefHash that returns a read-only clone. > Monitor (aka Luwak) has concurrency issues related to BytesRefHash#find > ----------------------------------------------------------------------- > > Key: LUCENE-9791 > URL: https://issues.apache.org/jira/browse/LUCENE-9791 > Project: Lucene - Core > Issue Type: Bug > Components: core/other > Affects Versions: master (9.0), 8.7, 8.8 > Reporter: Paweł Bugalski > Priority: Major > Attachments: LUCENE-9791.patch > > > _org.apache.lucene.monitor.Monitor_ can sometimes *NOT* match a document that > should be matched by one of registered queries if match operations are run > concurrently from multiple threads. > This is because sometimes in a concurrent environment > _TermFilteredPresearcher_ might not select a query that could later on match > one of documents being matched. > Internally _TermFilteredPresearcher_ is using a term acceptor: an instance of > _org.apache.lucene.monitor.QueryIndex.QueryTermFilter_. _QueryTermFilter_ is > correctly initialized under lock and its internal state (a map of > _org.apache.lucene.util.BytesRefHash_ instances) is correctly published. > Later one when those instances are used concurrently a problem with > _org.apache.lucene.util.BytesRefHash#find_ is triggered since it is not > thread safe. > _org.apache.lucene.util.BytesRefHash#find_ internally is using a private > _org.apache.lucene.util.BytesRefHash#equals_ method, which is using an > instance field _scratch1_ as a temporary buffer to compare its _ByteRef_ > parameter with contents of _ByteBlockPool_. This is not thread safe and can > cause incorrect answers as well as _ArrayOutOfBoundException_. > __ > -- This message was sent by Atlassian Jira (v8.3.4#803005) --------------------------------------------------------------------- To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org