Michael Gibney created LUCENE-8610:
--------------------------------------

             Summary: NPE in TermsHashPerField.add() for TokenStreams with 
lazily instantiated token Attributes
                 Key: LUCENE-8610
                 URL: https://issues.apache.org/jira/browse/LUCENE-8610
             Project: Lucene - Core
          Issue Type: Bug
          Components: core/index
    Affects Versions: 7.4, master (8.0)
            Reporter: Michael Gibney


{{DefaultIndexingChain.invert(...)}} calls
{{invertState.setAttributeSource(stream)}} before {{stream.incrementToken()}} 
is called.

For instances of {{stream}} that lazily instantiate token attributes (e.g., as 
{{solr.PreAnalyzedField$PreAnalyzedTokenizer}} does upon the first call to 
{{incrementToken()}} that returns {{true}}), this can result in caching a 
{{null}} value in {{invertState.termAttribute}} for a given {{stream}} 
instance. 

Subsequent calls that reuse the same {{stream}} instance (reusing 
{{TokenStreamComponents}}) for field values with at least 1 token will call 
{{termsHashPerField.start(...)}}, which sets {{termsHashPerField.termAtt}} from
the {{null}} value cached in the {{FieldInvertState.termAttribute}}. An NPE is 
thrown when {{termsHashPerField.add()}} reasonably but incorrectly assumes a 
non-null value for {{termAtt}}.
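
The sequence above can be sketched with a small self-contained model. This is
not Lucene code: {{LazyStream}}, {{InvertState}}, {{TermAttr}} and the
simplified {{invert(...)}} below are hypothetical stand-ins that only mimic
the pattern of caching attributes per stream instance before
{{incrementToken()}} is called. The sketch also assumes, for illustration,
that the first field value yields no tokens, which is one way the {{null}}
could end up cached.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;

public class LazyAttributeNpeDemo {

    /** Hypothetical stand-in for a term attribute; holds the current token text. */
    static final class TermAttr {
        String term;
    }

    /** Hypothetical stand-in for a reusable stream that creates its attribute lazily. */
    static final class LazyStream {
        private TermAttr termAttr;          // stays null until the first token is produced
        private Iterator<String> tokens;

        void reset(List<String> values) {
            tokens = values.iterator();
        }

        TermAttr getTermAttr() {
            return termAttr;
        }

        boolean incrementToken() {
            if (!tokens.hasNext()) {
                return false;
            }
            if (termAttr == null) {
                termAttr = new TermAttr();  // lazy instantiation on the first "true"
            }
            termAttr.term = tokens.next();
            return true;
        }
    }

    /** Hypothetical stand-in for per-field state that caches attributes per stream instance. */
    static final class InvertState {
        private LazyStream cachedSource;
        TermAttr termAttribute;

        void setAttributeSource(LazyStream stream) {
            if (cachedSource != stream) {   // attributes are looked up only for a new instance
                cachedSource = stream;
                termAttribute = stream.getTermAttr();
            }
        }
    }

    /** Processes one field value and returns the total number of term characters seen. */
    static int invert(InvertState state, LazyStream stream, List<String> values) {
        stream.reset(values);
        state.setAttributeSource(stream);   // runs before the first incrementToken()
        int chars = 0;
        while (stream.incrementToken()) {
            TermAttr termAtt = state.termAttribute; // may be the stale cached null
            chars += termAtt.term.length();         // throws NPE when termAtt is null
        }
        return chars;
    }

    /** Returns true if reusing the stream after an empty first value triggers an NPE. */
    static boolean reuseAfterEmptyValueThrows() {
        InvertState state = new InvertState();
        LazyStream stream = new LazyStream();
        invert(state, stream, Collections.<String>emptyList()); // null attribute gets cached
        try {
            invert(state, stream, Arrays.asList("foo"));        // same instance, now with a token
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("NPE on reuse: " + reuseAfterEmptyValueThrows());
    }
}
```

In this model the failure goes away if the stream creates its attribute
eagerly, or if the cached attribute is looked up again after
{{incrementToken()}} has returned {{true}}. Whether either approach is the
right fix for Lucene itself is not something this sketch establishes.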



