[ 
https://issues.apache.org/jira/browse/SOLR-7927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14697649#comment-14697649
 ] 

Yonik Seeley commented on SOLR-7927:
------------------------------------

bq. Yes, the document has a single large content field (10MB).

OK, so the temporary buffer allocated to encode that field as UTF-8 would grow 
to 40MB. Are you close enough to the heap limit that this would push it over? 
A 100MB JSON file could also grow to 200MB in memory (single-byte vs 
double-byte strings).
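
To make the arithmetic above concrete, here is a rough sketch of the two 
estimates. It assumes the encoder reserves 4 bytes per char up front (which is 
what the 10MB to 40MB figure implies) and that the JVM stores strings as 
2-byte UTF-16 chars, i.e. no compact strings. The class and method names are 
hypothetical and are not taken from Solr's code.

```java
import java.nio.charset.StandardCharsets;

public class BufferEstimate {
    // Assumption: worst-case reservation of 4 bytes per char before encoding.
    // This mirrors the 10MB -> 40MB figure in the comment, not Solr's source.
    static final int ASSUMED_BYTES_PER_CHAR = 4;

    // Size of a temporary buffer reserved up front for charCount chars.
    public static long worstCaseUtf8Bytes(long charCount) {
        return charCount * ASSUMED_BYTES_PER_CHAR;
    }

    // Heap used by the char data of a string on a JVM that stores
    // every char as 2 bytes (UTF-16, no compact strings).
    public static long utf16HeapBytes(long charCount) {
        return charCount * 2;
    }

    // Bytes actually needed once the text is encoded as UTF-8.
    public static long actualUtf8Bytes(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024L;

        // A 10MB (mostly ASCII) content field is roughly 10M chars.
        long fieldChars = 10 * mb;
        System.out.println("reserved for encoding (MB): "
                + worstCaseUtf8Bytes(fieldChars) / mb);

        // A 100MB single-byte JSON file is roughly 100M chars once parsed.
        long fileChars = 100 * mb;
        System.out.println("as UTF-16 strings (MB): "
                + utf16HeapBytes(fileChars) / mb);

        // For ASCII text the encoded size equals the char count, so most
        // of a worst-case reservation goes unused.
        System.out.println("actual UTF-8 bytes for \"abc\": "
                + actualUtf8Bytes("abc"));
    }
}
```

With a 1280M heap, a 200MB parsed request plus a 40MB scratch buffer is 
already a noticeable share of the total, before counting the index writer's 
own buffers.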

Disabling the transaction log will also lower some other memory usage, such as 
the bucket list in version info and the field cache entries used for looking 
up _version_, and possibly more.

> Transaction log consumes lot of memory when indexing large documents
> --------------------------------------------------------------------
>
>                 Key: SOLR-7927
>                 URL: https://issues.apache.org/jira/browse/SOLR-7927
>             Project: Solr
>          Issue Type: Bug
>          Components: update
>    Affects Versions: 5.2.1
>            Reporter: Shalin Shekhar Mangar
>             Fix For: Trunk, 5.4
>
>
> Solr is started with 1280M heap.
> ./bin/solr start -m 1280m
> Indexing a 100MB JSON file (using curl) containing large JSON documents from 
> project Gutenberg fails with OOM but indexing a 549M JSON file containing 
> small documents is indexed just fine.
> The same 100MB JSON file with the same heap size can be indexed just fine if 
> I disable the transaction log.



