Michael Gibney created LUCENE-8610:
--------------------------------------
Summary: NPE in TermsHashPerField.add() for TokenStreams with
lazily instantiated token Attributes
Key: LUCENE-8610
URL: https://issues.apache.org/jira/browse/LUCENE-8610
Project: Lucene - Core
Issue Type: Bug
Components: core/index
Affects Versions: 7.4, master (8.0)
Reporter: Michael Gibney
{{DefaultIndexingChain.invert(...)}} calls
{{invertState.setAttributeSource(stream)}} before {{stream.incrementToken()}}
is called.
For instances of {{stream}} that lazily instantiate token attributes (e.g., as
{{solr.PreAnalyzedField$PreAnalyzedTokenizer}} does upon the first call to
{{incrementToken()}} that returns {{true}}), this can result in caching a
{{null}} value in {{invertState.termAttribute}} for a given {{stream}}
instance.
Subsequent calls that reuse the same {{stream}} instance (reusing
{{TokenStreamComponents}}) for field values with at least one token will call
{{termsHashPerField.start(...)}}, which sets {{termsHashPerField.termAtt}} from
the {{null}} value cached in {{FieldInvertState.termAttribute}}. An NPE is
then thrown when {{termsHashPerField.add()}} reasonably but incorrectly assumes
a non-null value for {{termAtt}}.
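
The sequence described above can be modelled without Lucene. The following is
only a minimal sketch: {{TermAttr}}, {{LazyStream}}, {{InvertState}} and
{{LazyAttributeDemo}} are hypothetical stand-ins invented for illustration, not
Lucene or Solr classes, and the real code paths are more involved. The sketch
assumes a stream that registers its term attribute only when
{{incrementToken()}} first returns {{true}}, and a state object that looks up
attributes only when the stream instance changes.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a term attribute; not a Lucene class.
final class TermAttr {
    String term = "";
}

// Hypothetical stream that registers its attribute lazily, on the first
// incrementToken() call that returns true.
final class LazyStream {
    private final Map<Class<?>, Object> attributes = new HashMap<>();
    private String[] tokens = new String[0];
    private int pos = 0;

    void reset(String... newTokens) {
        tokens = newTokens;
        pos = 0;
    }

    // Returns the registered attribute, or null if none is registered yet.
    TermAttr getTermAttr() {
        return (TermAttr) attributes.get(TermAttr.class);
    }

    boolean incrementToken() {
        if (pos >= tokens.length) {
            return false;
        }
        TermAttr att = (TermAttr) attributes.computeIfAbsent(TermAttr.class, k -> new TermAttr());
        att.term = tokens[pos++];
        return true;
    }
}

// Hypothetical state object that caches attribute references per stream instance.
final class InvertState {
    private LazyStream source;
    TermAttr termAttribute;

    void setAttributeSource(LazyStream stream) {
        if (source != stream) {                    // look up only when the instance changes
            source = stream;
            termAttribute = stream.getTermAttr();  // null if the stream is still "empty"
        }
    }
}

final class LazyAttributeDemo {
    // Models the invert loop; returns the total length of all terms seen.
    static int invert(InvertState state, LazyStream stream, String... tokens) {
        stream.reset(tokens);
        state.setAttributeSource(stream);          // runs before any incrementToken()
        TermAttr termAtt = state.termAttribute;    // analogous to start(): copy cached reference
        int total = 0;
        while (stream.incrementToken()) {
            total += termAtt.term.length();        // analogous to add(): assumes non-null
        }
        return total;
    }

    public static void main(String[] args) {
        InvertState state = new InvertState();
        LazyStream stream = new LazyStream();
        invert(state, stream);                     // no tokens: null is cached silently
        boolean npe = false;
        try {
            invert(state, stream, "foo");          // same instance reused: cache is not refreshed
        } catch (NullPointerException e) {
            npe = true;
        }
        System.out.println("NPE on reused stream: " + npe);
    }
}
```

In this model the first value has no tokens, so the {{null}} is cached without
any visible failure; the second value reuses the same stream instance, the
cached reference is never refreshed, and the dereference fails even though the
stream has registered its attribute by then.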
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)