[jira] [Commented] (SOLR-13851) SolrIndexSearcher.getFirstMatch trips assertion if multiple matches
[ https://issues.apache.org/jira/browse/SOLR-13851?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16956455#comment-16956455 ] David Smiley commented on SOLR-13851: - Sounds reasonable. WDYT [~yo...@apache.org]? I suspect you added these long ago. > SolrIndexSearcher.getFirstMatch trips assertion if multiple matches > --- > > Key: SOLR-13851 > URL: https://issues.apache.org/jira/browse/SOLR-13851 > Project: Solr > Issue Type: Bug > Security Level: Public(Default Security Level. Issues are Public) >Reporter: Chris M. Hostetter >Priority: Major > > the documentation for {{SolrIndexSearcher.getFirstMatch}} says... > {quote} > Returns the first document number containing the term t Returns > -1 if no document was found. This method is primarily intended for clients > that want to fetch documents using a unique identifier." > @return the first document number containing the term > {quote} > But SOLR-12366 refactored {{SolrIndexSearcher.getFirstMatch}} to eliminate > it's previous implementation and replace it with a call to (a refactored > version of) {{SolrIndexSearcher.lookupId}} -- but the code in {{lookupId}} > was always designed *explicitly* for dealing with a uniqueKey field, and has > an assertion that once it finds a match _there will be no other matches in > the index_ > This means that even though {{getFirstMatch}} is _intended_ for fields that > are unique between documents, i it's used on a field that is not unique, it > can trip an assertion. > At a minimum we need to either "fix" {{getFirstMatch}} to behave as > documented, or fix it's documetation. > Given that the current behavior has now been in place since Solr 7.4, and > given that all existing uses in "core" solr code are for looking up docs by > uniqueKey, it's probably best to simply fix the documentation, but we should > also consider replacing hte assertion with an IllegalStateException, or > SolrException -- anything not dependent on having assertions enabled -- to > prevent silent bugs. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (SOLR-13851) SolrIndexSearcher.getFirstMatch trips assertion if multiple matches
[ https://issues.apache.org/jira/browse/SOLR-13851?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16956343#comment-16956343 ] Chris M. Hostetter commented on SOLR-13851: --- {quote}And as you note, we always use the ID. Perhaps these might be renamed... {quote} Yeah, i think the key changes should be: * both existing methods should be deprecated in 8x and removed in master ** getFirstMatch should be documented to note it's current peculiar state re: assertions and non-unique field usage * a new method should replace them that takes in _only_ the BytesRef and fails with a (non-assert) error if: ** the current schema doesn't use a uniqueKey field ** more then 1 doc is found matching the specified BytesRef in the uniqueKey field The exact return value/semantics of the new method should probably be something less likely to be missunderstood then a long with two parts that you have to bitshift to extract? i get the efficiency value in returning the perSegment docId for usecase that are already dealing with per-segment readers, but there's also a lot of super-simple cases (like the existing callers of getFirstMatch) that just want the (top level) docId and don't care about the segments, and i worry that two methods with similra names, but one that returns an "int" (global) docId and another that returns a "long" (bitshifted) segId+docId might lead plugin writers to confusing bugs (that could silently "work" on test indexes with only one segment) Maybe a simple container object with easy to understand accessor methods like: * {{boolean exists()}} * {{int getDocId()}} * {{int getLeafContextId()}} * {{int getDocIdInLeafContext()}} ? > SolrIndexSearcher.getFirstMatch trips assertion if multiple matches > --- > > Key: SOLR-13851 > URL: https://issues.apache.org/jira/browse/SOLR-13851 > Project: Solr > Issue Type: Bug > Security Level: Public(Default Security Level. Issues are Public) >Reporter: Chris M. Hostetter >Priority: Major > > the documentation for {{SolrIndexSearcher.getFirstMatch}} says... > {quote} > Returns the first document number containing the term t Returns > -1 if no document was found. This method is primarily intended for clients > that want to fetch documents using a unique identifier." > @return the first document number containing the term > {quote} > But SOLR-12366 refactored {{SolrIndexSearcher.getFirstMatch}} to eliminate > it's previous implementation and replace it with a call to (a refactored > version of) {{SolrIndexSearcher.lookupId}} -- but the code in {{lookupId}} > was always designed *explicitly* for dealing with a uniqueKey field, and has > an assertion that once it finds a match _there will be no other matches in > the index_ > This means that even though {{getFirstMatch}} is _intended_ for fields that > are unique between documents, i it's used on a field that is not unique, it > can trip an assertion. > At a minimum we need to either "fix" {{getFirstMatch}} to behave as > documented, or fix it's documetation. > Given that the current behavior has now been in place since Solr 7.4, and > given that all existing uses in "core" solr code are for looking up docs by > uniqueKey, it's probably best to simply fix the documentation, but we should > also consider replacing hte assertion with an IllegalStateException, or > SolrException -- anything not dependent on having assertions enabled -- to > prevent silent bugs. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (SOLR-13851) SolrIndexSearcher.getFirstMatch trips assertion if multiple matches
[ https://issues.apache.org/jira/browse/SOLR-13851?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16953893#comment-16953893 ] David Wayne Smiley commented on SOLR-13851: --- I agree with your analysis. That "assert" should be an IllegalStateException. And the implementation of getFirstMatch is fine; just update it's documentation. I dislike the names of these similar methods "getFirstMatch" and "lookupId" are basically doing the same thing yet the method names are so different. And as you note, we always use the ID. Perhaps these might be renamed to lookupDocIdByUniqueKey and lookupDocIdAsPairByUniqueKey and have them both simply take a BytesRef? It's okay to me that my proposed names are long-ish. > SolrIndexSearcher.getFirstMatch trips assertion if multiple matches > --- > > Key: SOLR-13851 > URL: https://issues.apache.org/jira/browse/SOLR-13851 > Project: Solr > Issue Type: Bug > Security Level: Public(Default Security Level. Issues are Public) >Reporter: Chris M. Hostetter >Priority: Major > > the documentation for {{SolrIndexSearcher.getFirstMatch}} says... > {quote} > Returns the first document number containing the term t Returns > -1 if no document was found. This method is primarily intended for clients > that want to fetch documents using a unique identifier." > @return the first document number containing the term > {quote} > But SOLR-12366 refactored {{SolrIndexSearcher.getFirstMatch}} to eliminate > it's previous implementation and replace it with a call to (a refactored > version of) {{SolrIndexSearcher.lookupId}} -- but the code in {{lookupId}} > was always designed *explicitly* for dealing with a uniqueKey field, and has > an assertion that once it finds a match _there will be no other matches in > the index_ > This means that even though {{getFirstMatch}} is _intended_ for fields that > are unique between documents, i it's used on a field that is not unique, it > can trip an assertion. > At a minimum we need to either "fix" {{getFirstMatch}} to behave as > documented, or fix it's documetation. > Given that the current behavior has now been in place since Solr 7.4, and > given that all existing uses in "core" solr code are for looking up docs by > uniqueKey, it's probably best to simply fix the documentation, but we should > also consider replacing hte assertion with an IllegalStateException, or > SolrException -- anything not dependent on having assertions enabled -- to > prevent silent bugs. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (SOLR-13851) SolrIndexSearcher.getFirstMatch trips assertion if multiple matches
[ https://issues.apache.org/jira/browse/SOLR-13851?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16953885#comment-16953885 ] Chris M. Hostetter commented on SOLR-13851: --- /cc [~dsmiley] ... not sure if you have any specific thought/concerns on moving forward here based on the prior change. > SolrIndexSearcher.getFirstMatch trips assertion if multiple matches > --- > > Key: SOLR-13851 > URL: https://issues.apache.org/jira/browse/SOLR-13851 > Project: Solr > Issue Type: Bug > Security Level: Public(Default Security Level. Issues are Public) >Reporter: Chris M. Hostetter >Priority: Major > > the documentation for {{SolrIndexSearcher.getFirstMatch}} says... > {quote} > Returns the first document number containing the term t Returns > -1 if no document was found. This method is primarily intended for clients > that want to fetch documents using a unique identifier." > @return the first document number containing the term > {quote} > But SOLR-12366 refactored {{SolrIndexSearcher.getFirstMatch}} to eliminate > it's previous implementation and replace it with a call to (a refactored > version of) {{SolrIndexSearcher.lookupId}} -- but the code in {{lookupId}} > was always designed *explicitly* for dealing with a uniqueKey field, and has > an assertion that once it finds a match _there will be no other matches in > the index_ > This means that even though {{getFirstMatch}} is _intended_ for fields that > are unique between documents, i it's used on a field that is not unique, it > can trip an assertion. > At a minimum we need to either "fix" {{getFirstMatch}} to behave as > documented, or fix it's documetation. > Given that the current behavior has now been in place since Solr 7.4, and > given that all existing uses in "core" solr code are for looking up docs by > uniqueKey, it's probably best to simply fix the documentation, but we should > also consider replacing hte assertion with an IllegalStateException, or > SolrException -- anything not dependent on having assertions enabled -- to > prevent silent bugs. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (SOLR-13851) SolrIndexSearcher.getFirstMatch trips assertion if multiple matches
[ https://issues.apache.org/jira/browse/SOLR-13851?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16953070#comment-16953070 ] Chris M. Hostetter commented on SOLR-13851: --- Background: I recently noticed jenkins test failures from TestCloudNestedDocsSort that stemmed from this assertion error... {noformat} [junit4] 2> Server ErrorCaused by:java.lang.AssertionError [junit4] 2>at org.apache.solr.search.SolrIndexSearcher.lookupId(SolrIndexSearcher.java:710) [junit4] 2>at org.apache.solr.search.SolrIndexSearcher.getFirstMatch(SolrIndexSearcher.java:676) [junit4] 2>at org.apache.solr.handler.component.QueryComponent.doProcessSearchByIds(QueryComponent.java:1266) [junit4] 2>at org.apache.solr.handler.component.QueryComponent.process(QueryComponent.java:351) [junit4] 2>at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:305) [junit4] 2>at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:198) [junit4] 2>at org.apache.solr.core.SolrCore.execute(SolrCore.java:2559) {noformat} At the core of the problem is that TestCloudNestedDocsSort does some things it shouldn't in terms of fhild doc uniqueKeys (which i'll track in a linked jira) ... but while using git bisect to identify when/where the failure was introduced, it identified GIT:1e63b32731bedf108aaeeb5d0a04d671f5663102 (SOLR-12366) as the first bad commit, and that's when i realized that prior to SOLR-12366 this (bad test) worked fine because {{getFirstMatch}} just did what it says: returned the first match (w/o complaining if there were multiples) > SolrIndexSearcher.getFirstMatch trips assertion if multiple matches > --- > > Key: SOLR-13851 > URL: https://issues.apache.org/jira/browse/SOLR-13851 > Project: Solr > Issue Type: Bug > Security Level: Public(Default Security Level. Issues are Public) >Reporter: Chris M. Hostetter >Priority: Major > > the documentation for {{SolrIndexSearcher.getFirstMatch}} says... > {quote} > Returns the first document number containing the term t Returns > -1 if no document was found. This method is primarily intended for clients > that want to fetch documents using a unique identifier." > @return the first document number containing the term > {quote} > But SOLR-12366 refactored {{SolrIndexSearcher.getFirstMatch}} to eliminate > it's previous implementation and replace it with a call to (a refactored > version of) {{SolrIndexSearcher.lookupId}} -- but the code in {{lookupId}} > was always designed *explicitly* for dealing with a uniqueKey field, and has > an assertion that once it finds a match _there will be no other matches in > the index_ > This means that even though {{getFirstMatch}} is _intended_ for fields that > are unique between documents, i it's used on a field that is not unique, it > can trip an assertion. > At a minimum we need to either "fix" {{getFirstMatch}} to behave as > documented, or fix it's documetation. > Given that the current behavior has now been in place since Solr 7.4, and > given that all existing uses in "core" solr code are for looking up docs by > uniqueKey, it's probably best to simply fix the documentation, but we should > also consider replacing hte assertion with an IllegalStateException, or > SolrException -- anything not dependent on having assertions enabled -- to > prevent silent bugs. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org