[ https://issues.apache.org/jira/browse/LUCENE-6842?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14963420#comment-14963420 ]
Jack Krupansky commented on LUCENE-6842:
----------------------------------------

Generally, Lucene has few hard limits; the general guidance is that you will ultimately be limited by available system resources such as RAM and CPU. There may be no hard limit on the number of fields, but that doesn't mean you can safely assume a large number of fields will always work within a limited amount of RAM and CPU. Exactly how much RAM and CPU you need depends on your specific application, and is something you will have to test for yourself - a proof of concept.

Generally, people run into resource problems based on the number of documents rather than the number of fields per document. You haven't said how many documents you are indexing, or how many of these fields are actually present in an average document. Who knows - maybe the number of fields is not the problem per se, and the number of documents is the cause of the resource issue, or a combination of the two.

That said, I will defer to the more senior Lucene committers here, but personally I would suggest that "hundreds" or "low thousands" is a more practical best-practice upper limit on the total number of fields in a Lucene index. Generally, "dozens", or at most "low hundreds", would be the most recommended and safest assumption. Sure, maybe 10,000 fields might actually work, but then the number of documents, the volume of operations, and query complexity will also come into play.

All of that said, I'm sure we are all intently curious why exactly you feel you need so many fields.
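As a rough illustration of why field count and document count interact: in Lucene 4.x, each indexed field with norms enabled costs on the order of one byte per document of heap once norms are loaded, so total norms heap grows with fields × documents. A minimal back-of-envelope sketch (the doc/field counts below are hypothetical, not taken from this report):

```python
def norms_heap_bytes(num_docs: int, num_indexed_fields: int) -> int:
    """Rough Lucene 4.x norms heap estimate: ~1 byte per doc per indexed field.

    This ignores per-field overhead (FieldInfos, terms dictionaries, etc.),
    so it is a lower bound on the field-count-driven memory cost.
    """
    return num_docs * num_indexed_fields


# Hypothetical workload: 2M documents, 10,000 indexed fields with norms.
estimate = norms_heap_bytes(2_000_000, 10_000)
print(f"~{estimate / 2**30:.1f} GiB of heap for norms alone")
```

Under those assumed numbers the norms alone land in the tens of gigabytes, which is the same order of magnitude as the 25GB reported below - consistent with the point that it is the combination of fields and documents, not fields alone, that exhausts memory.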
> No way to limit the fields cached in memory, leading to OOM when there are
> thousands of fields
> ----------------------------------------------------------------------------------------------------------
>
>                 Key: LUCENE-6842
>                 URL: https://issues.apache.org/jira/browse/LUCENE-6842
>             Project: Lucene - Core
>          Issue Type: Bug
>          Components: core/search
>    Affects Versions: 4.6.1
>         Environment: Linux, openjdk 1.6.x
>            Reporter: Bala Kolla
>         Attachments: HistogramOfHeapUsage.png
>
> I am opening this defect to get some guidance on handling a case of a
> server running out of memory, which seems to be related to how we index.
> I want to know if there is any way to reduce the memory impact before we
> look into reducing the number of fields. Basically, we have many thousands
> of fields being indexed, and this causes a large amount of memory to be
> used (25GB), eventually causing the application to hang and forcing us to
> restart every few minutes.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org