[ 
https://issues.apache.org/jira/browse/SOLR-13851?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16956343#comment-16956343
 ] 

Chris M. Hostetter commented on SOLR-13851:
-------------------------------------------

{quote}And as you note, we always use the ID. Perhaps these might be renamed...
{quote}
Yeah, i think the key changes should be:
 * both existing methods should be deprecated in 8x and removed in master
 ** getFirstMatch should be documented to note it's current peculiar state re: 
assertions and non-unique field usage
 * a new method should replace them that takes in _only_ the BytesRef and fails 
with a (non-assert) error if:
 ** the current schema doesn't use a uniqueKey field
 ** more then 1 doc is found matching the specified BytesRef in the uniqueKey 
field

The exact return value/semantics of the new method should probably be .... 
something less likely to be missunderstood then a long with two parts that you 
have to bitshift to extract?

i get the efficiency value in returning the perSegment docId for usecase that 
are already dealing with per-segment readers, but there's also a lot of 
super-simple cases (like the existing callers of getFirstMatch) that just want 
the (top level) docId and don't care about the segments, and i worry that two 
methods with similra names, but one that returns an "int" (global) docId and 
another that returns a "long" (bitshifted) segId+docId might lead plugin 
writers to confusing bugs (that could silently "work" on test indexes with only 
one segment)

Maybe a simple container object with easy to understand accessor methods like:
 * {{boolean exists()}}
 * {{int getDocId()}}
 * {{int getLeafContextId()}}
 * {{int getDocIdInLeafContext()}}

?

> SolrIndexSearcher.getFirstMatch trips assertion if multiple matches
> -------------------------------------------------------------------
>
>                 Key: SOLR-13851
>                 URL: https://issues.apache.org/jira/browse/SOLR-13851
>             Project: Solr
>          Issue Type: Bug
>      Security Level: Public(Default Security Level. Issues are Public) 
>            Reporter: Chris M. Hostetter
>            Priority: Major
>
> the documentation for {{SolrIndexSearcher.getFirstMatch}} says...
> {quote}
> Returns the first document number containing the term <code>t</code> Returns 
> -1 if no document was found. This method is primarily intended for clients 
> that want to fetch documents using a unique identifier."
> @return the first document number containing the term
> {quote}
> But SOLR-12366 refactored {{SolrIndexSearcher.getFirstMatch}} to eliminate 
> it's previous implementation and replace it with a call to (a refactored 
> version of) {{SolrIndexSearcher.lookupId}} -- but the code in {{lookupId}} 
> was always designed *explicitly* for dealing with a uniqueKey field, and has 
> an assertion that once it finds a match _there will be no other matches in 
> the index_
> This means that even though {{getFirstMatch}} is _intended_ for fields that 
> are unique between documents, i it's used on a field that is not unique, it 
> can trip an assertion.
> At a minimum we need to either "fix" {{getFirstMatch}} to behave as 
> documented, or fix it's documetation.
> Given that the current behavior has now been in place since Solr 7.4, and 
> given that all existing uses in "core" solr code are for looking up docs by 
> uniqueKey, it's probably best to simply fix the documentation, but we should 
> also consider replacing hte assertion with an IllegalStateException, or 
> SolrException -- anything not dependent on having assertions enabled -- to 
> prevent silent bugs.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org

Reply via email to