[ https://issues.apache.org/jira/browse/SOLR-13851?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16956343#comment-16956343 ]
Chris M. Hostetter commented on SOLR-13851: ------------------------------------------- {quote}And as you note, we always use the ID. Perhaps these might be renamed... {quote} Yeah, i think the key changes should be: * both existing methods should be deprecated in 8x and removed in master ** getFirstMatch should be documented to note it's current peculiar state re: assertions and non-unique field usage * a new method should replace them that takes in _only_ the BytesRef and fails with a (non-assert) error if: ** the current schema doesn't use a uniqueKey field ** more then 1 doc is found matching the specified BytesRef in the uniqueKey field The exact return value/semantics of the new method should probably be .... something less likely to be missunderstood then a long with two parts that you have to bitshift to extract? i get the efficiency value in returning the perSegment docId for usecase that are already dealing with per-segment readers, but there's also a lot of super-simple cases (like the existing callers of getFirstMatch) that just want the (top level) docId and don't care about the segments, and i worry that two methods with similra names, but one that returns an "int" (global) docId and another that returns a "long" (bitshifted) segId+docId might lead plugin writers to confusing bugs (that could silently "work" on test indexes with only one segment) Maybe a simple container object with easy to understand accessor methods like: * {{boolean exists()}} * {{int getDocId()}} * {{int getLeafContextId()}} * {{int getDocIdInLeafContext()}} ? > SolrIndexSearcher.getFirstMatch trips assertion if multiple matches > ------------------------------------------------------------------- > > Key: SOLR-13851 > URL: https://issues.apache.org/jira/browse/SOLR-13851 > Project: Solr > Issue Type: Bug > Security Level: Public(Default Security Level. Issues are Public) > Reporter: Chris M. Hostetter > Priority: Major > > the documentation for {{SolrIndexSearcher.getFirstMatch}} says... > {quote} > Returns the first document number containing the term <code>t</code> Returns > -1 if no document was found. This method is primarily intended for clients > that want to fetch documents using a unique identifier." > @return the first document number containing the term > {quote} > But SOLR-12366 refactored {{SolrIndexSearcher.getFirstMatch}} to eliminate > it's previous implementation and replace it with a call to (a refactored > version of) {{SolrIndexSearcher.lookupId}} -- but the code in {{lookupId}} > was always designed *explicitly* for dealing with a uniqueKey field, and has > an assertion that once it finds a match _there will be no other matches in > the index_ > This means that even though {{getFirstMatch}} is _intended_ for fields that > are unique between documents, i it's used on a field that is not unique, it > can trip an assertion. > At a minimum we need to either "fix" {{getFirstMatch}} to behave as > documented, or fix it's documetation. > Given that the current behavior has now been in place since Solr 7.4, and > given that all existing uses in "core" solr code are for looking up docs by > uniqueKey, it's probably best to simply fix the documentation, but we should > also consider replacing hte assertion with an IllegalStateException, or > SolrException -- anything not dependent on having assertions enabled -- to > prevent silent bugs. -- This message was sent by Atlassian Jira (v8.3.4#803005) --------------------------------------------------------------------- To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org