[ https://issues.apache.org/jira/browse/LUCENE-6779?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Shalin Shekhar Mangar updated LUCENE-6779:
------------------------------------------
    Attachment: LUCENE-6779.patch

Patch with the changes.

> Reduce memory allocated by CompressingStoredFieldsWriter to write large strings
> -------------------------------------------------------------------------------
>
>                 Key: LUCENE-6779
>                 URL: https://issues.apache.org/jira/browse/LUCENE-6779
>             Project: Lucene - Core
>          Issue Type: Improvement
>          Components: core/codecs
>            Reporter: Shalin Shekhar Mangar
>         Attachments: LUCENE-6779.patch
>
>
> In SOLR-7927, I am trying to reduce the memory required to index very large
> documents (between 10 and 100 MB). One of the places that allocates a lot of
> heap is the UTF-8 encoding in CompressingStoredFieldsWriter. The same problem
> existed in JavaBinCodec, and in SOLR-7971 we reduced its memory allocation by
> falling back to a double-pass approach when the UTF-8 size of the string is
> greater than 64KB.
> I propose to make the same changes to CompressingStoredFieldsWriter as we
> made to JavaBinCodec in SOLR-7971.
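
For readers unfamiliar with the double-pass idea mentioned above, here is a minimal sketch in plain Java. It is not the attached patch and does not use any Lucene or Solr API: the class name TwoPassUtf8, the method names, and the scratch-buffer size are all hypothetical, and only the 64KB threshold comes from the issue text. The first pass counts the UTF-8 bytes without allocating anything, so a length prefix can be written up front. The second pass encodes through a small reusable buffer instead of materializing the whole encoded string on the heap. Unpaired surrogates are written as U+FFFD here, which is an assumption of this sketch.

```java
import java.io.IOException;
import java.io.OutputStream;

// Hypothetical illustration of a double-pass UTF-8 writer; not the LUCENE-6779 patch.
final class TwoPassUtf8 {
    // Threshold quoted in the issue; how the real patch applies it is not shown here.
    static final int DOUBLE_PASS_THRESHOLD = 64 * 1024;

    private static boolean isPair(CharSequence s, int i) {
        return Character.isHighSurrogate(s.charAt(i))
            && i + 1 < s.length()
            && Character.isLowSurrogate(s.charAt(i + 1));
    }

    // Pass 1: count the UTF-8 bytes of s without allocating any buffer.
    static int utf8Length(CharSequence s) {
        int bytes = 0;
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c < 0x80) {
                bytes += 1;
            } else if (c < 0x800) {
                bytes += 2;
            } else if (isPair(s, i)) {
                bytes += 4;   // supplementary code point, consumes two chars
                i++;
            } else {
                bytes += 3;   // BMP char, or unpaired surrogate written as U+FFFD
            }
        }
        return bytes;
    }

    // Pass 2: encode s through a small reusable buffer, flushing when it fills up.
    // Returns the number of bytes written. scratch must hold at least 4 bytes.
    static int encodeChunked(OutputStream out, CharSequence s, byte[] scratch) throws IOException {
        if (scratch.length < 4) {
            throw new IllegalArgumentException("scratch buffer must hold at least 4 bytes");
        }
        int upto = 0;
        int total = 0;
        for (int i = 0; i < s.length(); i++) {
            if (upto + 4 > scratch.length) {   // no room for the longest sequence: flush
                out.write(scratch, 0, upto);
                total += upto;
                upto = 0;
            }
            int cp = s.charAt(i);
            if (cp < 0x80) {
                scratch[upto++] = (byte) cp;
            } else if (cp < 0x800) {
                scratch[upto++] = (byte) (0xC0 | (cp >> 6));
                scratch[upto++] = (byte) (0x80 | (cp & 0x3F));
            } else if (isPair(s, i)) {
                cp = Character.toCodePoint(s.charAt(i), s.charAt(i + 1));
                i++;
                scratch[upto++] = (byte) (0xF0 | (cp >> 18));
                scratch[upto++] = (byte) (0x80 | ((cp >> 12) & 0x3F));
                scratch[upto++] = (byte) (0x80 | ((cp >> 6) & 0x3F));
                scratch[upto++] = (byte) (0x80 | (cp & 0x3F));
            } else {
                if (Character.isSurrogate((char) cp)) {
                    cp = 0xFFFD;   // assumption of this sketch: replace unpaired surrogates
                }
                scratch[upto++] = (byte) (0xE0 | (cp >> 12));
                scratch[upto++] = (byte) (0x80 | ((cp >> 6) & 0x3F));
                scratch[upto++] = (byte) (0x80 | (cp & 0x3F));
            }
        }
        out.write(scratch, 0, upto);
        return total + upto;
    }
}
```

A caller would typically invoke utf8Length first, write that value as the length prefix, and then call encodeChunked with a buffer of a few kilobytes. The trade-off is that the string is scanned twice, in exchange for a peak allocation bounded by the scratch buffer rather than by the size of the document.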