[ 
https://issues.apache.org/jira/browse/NIFI-2787?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15507466#comment-15507466
 ] 

Michael Moser commented on NIFI-2787:
-------------------------------------

In NiFi 1.1.0-SNAPSHOT, I wrote a unit test that reproduces this. It manifests as:

[Index Provenance Events] ERROR org.apache.nifi.provenance.PersistentProvenanceRepository - Failed to index Provenance Event for target/storage/722797da-d510-4cee-b1ed-cd497df6052a/0.prov.gz to target/storage/722797da-d510-4cee-b1ed-cd497df6052a/index-1474396176476
java.lang.IllegalArgumentException: Document contains at least one immense term in field="immense" (whose UTF8 encoding is longer than the max length 32766), all of which were skipped.  Please correct the analyzer to not produce such terms.  The prefix of the first immense term is: <junk>, original message: bytes can be at most 32766 in length; got 36000
        at <snip>
        at org.apache.nifi.provenance.lucene.IndexingAction.index(IndexingAction.java:126)
        at org.apache.nifi.provenance.PersistentProvenanceRepository$12.call(PersistentProvenanceRepository.java:1742)


> PersistentProvenanceRepository rollover can fail on immense indexed attributes
> ------------------------------------------------------------------------------
>
>                 Key: NIFI-2787
>                 URL: https://issues.apache.org/jira/browse/NIFI-2787
>             Project: Apache NiFi
>          Issue Type: Bug
>          Components: Core Framework
>    Affects Versions: 1.0.0, 0.7.0
>            Reporter: Michael Moser
>
> Accidentally created an immense attribute (36,000 bytes), which I indexed 
> with nifi.provenance.repository.indexed.attributes.  Received this error.
> ERROR [Provenance Repository Rollover Thread-1] 
> o.a.n.p.PersistentProvenanceRepository Failed to rollover Provenance 
> repository due to java.lang.IllegalArgumentException: Document contains at 
> least one immense term in field="FOO" (whose UTF8 encoding is longer than the 
> max length 32766), all of which were skipped. Please correct the analyzer to 
> not produce such terms.
> Perhaps this is as simple as changing 
> https://github.com/apache/nifi/blob/master/nifi-nar-bundles/nifi-provenance-repository-bundle/nifi-persistent-provenance-repository/src/main/java/org/apache/nifi/provenance/RepositoryConfiguration.java#L37
>  to 32766 to match Lucene.  Investigation & testing needed.
> For background, this Lucene ticket made exceeding the term size limit an 
> IllegalArgumentException https://issues.apache.org/jira/browse/LUCENE-5472
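As a rough illustration of the kind of guard discussed above: the limit in the error message is counted in UTF-8 bytes, not characters, so a cap on attribute length would need to measure the encoded size. The sketch below is not NiFi code; the class name TermLengthGuard and the method truncateToUtf8Bytes are hypothetical, and only the 32766 figure comes from the error message. It trims a value to the longest prefix whose UTF-8 encoding fits within the limit, without splitting a multi-byte character.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical helper, not part of NiFi or Lucene.
class TermLengthGuard {
    // Maximum term length in UTF-8 bytes, taken from the error message above.
    static final int MAX_TERM_BYTES = 32766;

    // Returns the longest prefix of value whose UTF-8 encoding is at most maxBytes long.
    static String truncateToUtf8Bytes(String value, int maxBytes) {
        int bytes = 0;
        int i = 0;
        while (i < value.length()) {
            int cp = value.codePointAt(i);
            // Number of bytes this code point occupies in UTF-8.
            int cpBytes = cp < 0x80 ? 1 : cp < 0x800 ? 2 : cp < 0x10000 ? 3 : 4;
            if (bytes + cpBytes > maxBytes) {
                break;
            }
            bytes += cpBytes;
            // Advance by one or two chars so surrogate pairs stay together.
            i += Character.charCount(cp);
        }
        return value.substring(0, i);
    }

    public static void main(String[] args) {
        // Build a 36,000-character ASCII attribute like the one in the report.
        char[] chars = new char[36000];
        Arrays.fill(chars, 'x');
        String immense = new String(chars);

        String safe = truncateToUtf8Bytes(immense, MAX_TERM_BYTES);
        System.out.println(safe.getBytes(StandardCharsets.UTF_8).length); // prints 32766
    }
}
```

Whether truncating, skipping the field, or lowering the configured attribute length is the right behavior for the repository is a separate question that still needs the investigation mentioned in the issue.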



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
