[jira] [Commented] (LUCENE-6842) No way to limit the fields cached in memory and leads to OOM when there are thousand of fields (thousands)

Uwe Schindler (JIRA) Mon, 19 Oct 2015 07:45:46 -0700

    [ https://issues.apache.org/jira/browse/LUCENE-6842?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14963403#comment-14963403 ]


Uwe Schindler commented on LUCENE-6842:
---------------------------------------

The problem here is that you have not given us enough information about what 
your index looks like. Your explanations make me think that you have a single, 
large index into which many customers place documents, each with a different 
set of fields. If that is the case, why not create one index per customer, 
containing only the fields that customer needs? You can load and unload them 
as needed. If you need to search across several indexes (not all, of course), 
just use MultiReader.
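
A minimal sketch of the "load and unload them as needed" idea, in plain Java 
with no Lucene dependency. The class name CustomerIndexCache, the Opener 
interface and the maxOpen limit are hypothetical and only for illustration. 
In a real setup the handle type would presumably be a Lucene reader, and the 
readers picked for one search could be wrapped in a MultiReader.

```java
import java.io.Closeable;
import java.io.IOException;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/**
 * Keeps at most maxOpen per-customer index handles open and closes the
 * least recently used one when the limit is exceeded.
 * Hypothetical helper, not part of Lucene.
 */
public class CustomerIndexCache<R extends Closeable> {

    /** Hypothetical hook: opens the index handle that belongs to one customer. */
    public interface Opener<R> {
        R open(String customerId) throws IOException;
    }

    private final int maxOpen;
    private final Opener<R> opener;
    // accessOrder = true: iteration runs from least to most recently used.
    private final LinkedHashMap<String, R> open =
            new LinkedHashMap<String, R>(16, 0.75f, true);

    public CustomerIndexCache(int maxOpen, Opener<R> opener) {
        if (maxOpen < 1) {
            throw new IllegalArgumentException("maxOpen must be >= 1");
        }
        this.maxOpen = maxOpen;
        this.opener = opener;
    }

    /** Returns the handle for a customer, opening it on first use. */
    public synchronized R get(String customerId) throws IOException {
        R handle = open.get(customerId); // also marks the entry as recently used
        if (handle == null) {
            handle = opener.open(customerId);
            open.put(customerId, handle);
            evictIfNeeded();
        }
        return handle;
    }

    /** Closes least recently used handles until the limit is respected. */
    private void evictIfNeeded() throws IOException {
        Iterator<Map.Entry<String, R>> it = open.entrySet().iterator();
        while (open.size() > maxOpen && it.hasNext()) {
            R victim = it.next().getValue();
            it.remove();
            victim.close();
        }
    }

    /** Customers whose index is currently open, least recently used first. */
    public synchronized List<String> openCustomers() {
        return new ArrayList<String>(open.keySet());
    }
}
```

This sketch closes a handle as soon as it is evicted, which is only safe if no 
search is still running on it. A real implementation would need some form of 
reference counting before closing (Lucene readers offer incRef()/decRef() for 
that purpose).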

If you really have that many fields in *all* documents: is it really necessary 
to have so many? We are talking about full-text search, so in an ideal world 
one would have only a single field to search on (Google-like) :-) Of course 
this is generally not so simple, because people want to search in parts of the 
document, but millions? Normally this goes up to about 100 different fields 
across your whole corpus.

To answer your question: No, you cannot lazy-load term indexes.

> No way to limit the fields cached in memory and leads to OOM when there are 
> thousand of fields (thousands)
> ----------------------------------------------------------------------------------------------------------
>
>                 Key: LUCENE-6842
>                 URL: https://issues.apache.org/jira/browse/LUCENE-6842
>             Project: Lucene - Core
>          Issue Type: Bug
>          Components: core/search
>    Affects Versions: 4.6.1
>         Environment: Linux, openjdk 1.6.x
>            Reporter: Bala Kolla
>         Attachments: HistogramOfHeapUsage.png
>
>
> I am opening this defect to get some guidance on how to handle a case of the 
> server running out of memory; it seems to have something to do with how we 
> index. We want to know if there is any way to reduce the impact of this on 
> memory usage before we look into reducing the number of fields. 
> Basically, we have many thousands of fields being indexed, which causes a 
> large amount of memory to be used (25GB), eventually leading the 
> application to hang and forcing us to restart every few minutes.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

