oleewere opened a new pull request #52: Limit max length of tokens (Solr string)
URL: https://github.com/apache/ambari-logsearch/pull/52

# What changes were proposed in this pull request?
Limit the maximum length of tokenized strings. It is not normal for a tokenized string to be this large, but such strings should at least be filtered out, because otherwise the following can happen on the Solr side:

```bash
Error: {
  "responseHeader":{
    "status":400,
    "QTime":5},
  "error":{
    "metadata":[
      "error-class","org.apache.solr.common.SolrException",
      "root-error-class","org.apache.lucene.util.BytesRefHash$MaxBytesLengthExceededException"],
    "msg":"Exception writing document id 30b6c195-e29b-4ab9-9583-511b3e798461 to the index; possible analysis error: Document contains at least one immense term in field=\"key_log_message\" (whose UTF8 encoding is longer than the max length 32766), all of which were skipped. Please correct the analyzer to not produce such terms. The prefix of the first immense term is: '[91, 111, 119, 110, 101, 114, 58, 51, 52, 57, 57, 98, 57, 100, 50, 45, 101, 97, 51, 97, 45, 52, 101, 54, 56, 45, 57, 99, 52, 53]...', original message: bytes can be at most 32766 in length; got 63613. Perhaps the document has an indexed string field (solr.StrField) which is too large",
    "code":400}}
```

## How was this patch tested?
docker env
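The filtering idea described above can be illustrated with a minimal sketch. This is not the code from the pull request: the class `TokenLengthLimiter` and its methods are hypothetical names chosen for illustration, and the only value taken from the source is the 32766-byte limit quoted in the Solr error. The sketch measures the UTF-8 encoded length rather than the character count, since the error message states the limit in bytes.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only; not the implementation from the pull request.
final class TokenLengthLimiter {
    // Maximum UTF-8 length of a single indexed term, as quoted in the Solr error message.
    static final int MAX_TERM_BYTES = 32766;

    private TokenLengthLimiter() {
    }

    // Returns true when the token's UTF-8 encoding is at most maxBytes long.
    static boolean fitsInIndex(String token, int maxBytes) {
        return token.getBytes(StandardCharsets.UTF_8).length <= maxBytes;
    }

    // Returns a new list containing only the non-null tokens that fit within maxBytes.
    static List<String> dropOversizedTokens(List<String> tokens, int maxBytes) {
        List<String> kept = new ArrayList<>();
        for (String token : tokens) {
            if (token != null && fitsInIndex(token, maxBytes)) {
                kept.add(token);
            }
        }
        return kept;
    }
}
```

With this sketch, a 63613-byte token like the one in the error above would be dropped before the document is sent to Solr, while ordinary short tokens pass through unchanged. Dropping is only one possible policy; truncating the token to the limit is an alternative the source does not rule out.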
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

With regards,
Apache Git Services