[jira] [Updated] (METRON-1567) Large error message can't be written in Solr
[ https://issues.apache.org/jira/browse/METRON-1567?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Justin Leet updated METRON-1567:
    Fix Version/s: 0.6.0

> Large error message can't be written in Solr
>
> Key: METRON-1567
> URL: https://issues.apache.org/jira/browse/METRON-1567
> Project: Metron
> Issue Type: Sub-task
> Reporter: Justin Leet
> Assignee: Justin Leet
> Priority: Major
> Fix For: 0.6.0
>
> Error message on the feature branch:
> {code:java}
> org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException: Error
> from server at
> http://ip-11-0-1-51.us-west-2.compute.internal:8983/solr/error: Exception
> writing document id cd6db5c1-f41b-4dcf-8f68-583c7fc08575 to the index;
> possible analysis error: Document contains at least one immense term in
> field="raw_message_1" (whose UTF8 encoding is longer than the max length
> 32766), all of which were skipped. Please correct the analyzer to not produce
> such terms. The prefix of the first immense term is: '[123, 34, 101, 120, 99,
> 101, 112, 116, 105, 111, 110, 34, 58, 34, 106, 97, 118, 97, 46, 105, 111, 46,
> 70, 105, 108, 101, 78, 111, 116, 70]...', original message: bytes can be at
> most 32766 in length; got 165866. Perhaps the document has an indexed string
> field (solr.StrField) which is too large
> at
> org.apache.solr.client.solrj.impl.HttpSolrClient.executeMethod(HttpSolrClient.java:612)
> ~[stormjar.jar:?]
> ...{code}
> This is a hard limit of string fields, per
> https://lucene.apache.org/solr/guide/6_6/field-types-included-with-solr.html
> It also mentions they aren't tokenized or analyzed, so it doesn't seem like
> we'd be able to turn this limit off.
> Text fields don't list any sort of limit (although they may still have one),
> so we may want to switch to that, but it would require testing.
> Additionally, it appears that raw_message is dynamic (since it's getting _1,
> but we don't define it in the schema).
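The limit in the error message is measured in UTF-8 bytes, not characters. As a rough illustration only, the check below reproduces that comparison on the client side. The class and method names are hypothetical and are not part of Metron or SolrJ; the 32766 constant is taken from the error text above, and the sample payload is synthetic.

```java
import java.nio.charset.StandardCharsets;

public class ImmenseTermCheck {
    // Limit quoted in the error message: "bytes can be at most 32766 in length".
    static final int MAX_TERM_BYTES = 32766;

    // Size of the value once encoded as UTF-8, which is what the limit applies to.
    static int utf8Length(String value) {
        return value.getBytes(StandardCharsets.UTF_8).length;
    }

    // True if indexing this value as a single term would hit the limit.
    static boolean exceedsTermLimit(String value) {
        return utf8Length(value) > MAX_TERM_BYTES;
    }

    public static void main(String[] args) {
        // Synthetic stand-in for a raw_message of the size reported in the error (165866 bytes).
        String rawMessage = new String(new char[165866]).replace('\0', 'x');
        System.out.println(utf8Length(rawMessage) + " bytes, exceeds limit: "
                + exceedsTermLimit(rawMessage));
    }
}
```

A check like this only detects the condition; whether to truncate the value, skip it, or change the field type is the open question in the issue.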
-- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (METRON-1567) Large error message can't be written in Solr
[ https://issues.apache.org/jira/browse/METRON-1567?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Justin Leet updated METRON-1567:
    Summary: Large error message can't be written in Solr (was: Large error message can't be written)

> Large error message can't be written in Solr
>
> Key: METRON-1567
> URL: https://issues.apache.org/jira/browse/METRON-1567
> Project: Metron
> Issue Type: Sub-task
> Reporter: Justin Leet
> Assignee: Justin Leet
> Priority: Major
>
> Error message on the feature branch:
> {code:java}
> org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException: Error
> from server at
> http://ip-11-0-1-51.us-west-2.compute.internal:8983/solr/error: Exception
> writing document id cd6db5c1-f41b-4dcf-8f68-583c7fc08575 to the index;
> possible analysis error: Document contains at least one immense term in
> field="raw_message_1" (whose UTF8 encoding is longer than the max length
> 32766), all of which were skipped. Please correct the analyzer to not produce
> such terms. The prefix of the first immense term is: '[123, 34, 101, 120, 99,
> 101, 112, 116, 105, 111, 110, 34, 58, 34, 106, 97, 118, 97, 46, 105, 111, 46,
> 70, 105, 108, 101, 78, 111, 116, 70]...', original message: bytes can be at
> most 32766 in length; got 165866. Perhaps the document has an indexed string
> field (solr.StrField) which is too large
> at
> org.apache.solr.client.solrj.impl.HttpSolrClient.executeMethod(HttpSolrClient.java:612)
> ~[stormjar.jar:?]
> ...{code}
> This is a hard limit of string fields, per
> https://lucene.apache.org/solr/guide/6_6/field-types-included-with-solr.html
> It also mentions they aren't tokenized or analyzed, so it doesn't seem like
> we'd be able to turn this limit off.
> Text fields don't list any sort of limit (although they may still have one),
> so we may want to switch to that, but it would require testing.
> Additionally, it appears that raw_message is dynamic (since it's getting _1,
> but we don't define it in the schema).