oleewere opened a new pull request #52: Limit max length of tokens (Solr string)
URL: https://github.com/apache/ambari-logsearch/pull/52
 
 
   # What changes were proposed in this pull request?
   Limit the max length of tokenized strings. It is not normal for a tokenized
string to be this large, but such strings should at least be filtered out,
because the following can happen on the Solr side:
   ```bash
   Error: {
     "responseHeader":{
       "status":400,
       "QTime":5},
     "error":{
       "metadata":[
         "error-class","org.apache.solr.common.SolrException",
         "root-error-class","org.apache.lucene.util.BytesRefHash$MaxBytesLengthExceededException"],
       "msg":"Exception writing document id 30b6c195-e29b-4ab9-9583-511b3e798461 to the index; possible analysis error: Document contains at least one immense term in field=\"key_log_message\" (whose UTF8 encoding is longer than the max length 32766), all of which were skipped.  Please correct the analyzer to not produce such terms.  The prefix of the first immense term is: '[91, 111, 119, 110, 101, 114, 58, 51, 52, 57, 57, 98, 57, 100, 50, 45, 101, 97, 51, 97, 45, 52, 101, 54, 56, 45, 57, 99, 52, 53]...', original message: bytes can be at most 32766 in length; got 63613. Perhaps the document has an indexed string field (solr.StrField) which is too large",
       "code":400}}
   ```
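
The kind of length check described above could be sketched roughly as follows. This is only an illustration, not the code from this pull request: the class `TokenLengthFilter` and its methods are hypothetical names, and dropping the oversized field (rather than truncating it) is an assumption. The 32766-byte limit is the one quoted in the Solr error, and it applies to the UTF-8 encoded length, not the character count.

```java
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of ambari-logsearch: illustrates filtering
// string values that Solr/Lucene would reject as "immense terms".
public class TokenLengthFilter {

    // Max UTF-8 length of a single indexed term, as quoted in the Solr error.
    public static final int MAX_TERM_BYTES = 32766;

    // True if the value's UTF-8 encoding fits within maxBytes.
    public static boolean fitsInTerm(String value, int maxBytes) {
        return value.getBytes(StandardCharsets.UTF_8).length <= maxBytes;
    }

    // Returns a copy of the document without the fields whose string value
    // is too large to index; null values are kept untouched.
    public static Map<String, String> dropOversizedFields(Map<String, String> doc, int maxBytes) {
        Map<String, String> kept = new LinkedHashMap<>();
        for (Map.Entry<String, String> entry : doc.entrySet()) {
            String value = entry.getValue();
            if (value == null || fitsInTerm(value, maxBytes)) {
                kept.put(entry.getKey(), value);
            }
        }
        return kept;
    }
}
```

With a check like this, a document whose `key_log_message` is 63613 bytes long would be sent to Solr without that field instead of failing the whole write with a 400 response.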
   ## How was this patch tested?
   Tested in the Docker environment.
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services
