[
https://issues.apache.org/jira/browse/LUCENE-5875?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14092932#comment-14092932
]
Christian Ziech commented on LUCENE-5875:
-----------------------------------------
{quote}
Hmm, something else is wrong then ... or was this just an OOME? If not, can you
reproduce the non-OOME when turning on packing despite node count being well
below 2.1B?
{quote}
Sure - give me 1-2 days and I'll paste it here.
> Default page/block sizes in the FST package can cause OOMs
> ----------------------------------------------------------
>
> Key: LUCENE-5875
> URL: https://issues.apache.org/jira/browse/LUCENE-5875
> Project: Lucene - Core
> Issue Type: Improvement
> Components: core/FSTs
> Affects Versions: 4.9
> Reporter: Christian Ziech
> Priority: Minor
>
> We are building some fairly big FSTs (the biggest one having about 500M terms
> with an average of 20 characters per term) and that works very well so far.
> The problem is just that we can use neither the "doShareSuffix" nor the
> "doPackFST" option from the builder, since both cause exceptions for us:
> one is an OOM and the other an IllegalArgumentException for a negative array
> size in ArrayUtil.
> In theory we still have far more than enough memory available, but it seems
> that Java for some reason cannot allocate byte or long arrays of the size the
> NodeHash needs (maybe fragmentation?).
> Reducing the constant in the NodeHash from 1<<30 to e.g. 1<<27 seems to mostly
> fix the issue. Could the Builder pass its bytesPageBits through to the
> NodeHash, or could we get a custom parameter for that?
> The other problem we ran into was a NegativeArraySizeException when we tried to
> pack the FST. It seems that we overflowed to 0x80000000. Unfortunately I
> accidentally overwrote that exception, but I remember it was triggered by the
> GrowableWriter for the inCounts in line 728 of the FST. If it helps, I can try
> to reproduce it.
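
To make the two numbers in the description concrete, here is a minimal Java sketch of the arithmetic involved. It is not Lucene code: the class and method names are hypothetical, the 32 bits per value is an assumed figure chosen only for illustration, and the page layout (values bit-packed into one contiguous long[]) is an assumption about how a page might be stored. It shows how much contiguous memory one page of 1<<30 versus 1<<27 values would need under those assumptions, and how doubling an int size of 1<<30 wraps to 0x80000000, which is negative when used as an array length.

```java
final class FstSizeSketch {

    // Bytes needed for one page of (1 << pageBits) values, assuming the
    // values are bit-packed into a single contiguous long[] (an assumption
    // made for this sketch, not a statement about Lucene internals).
    static long pageBytes(int pageBits, int bitsPerValue) {
        long values = 1L << pageBits;
        long longs = (values * bitsPerValue + 63) / 64; // round up to whole longs
        return longs * Long.BYTES;
    }

    // Naive growth in int arithmetic: silently wraps once the result
    // exceeds Integer.MAX_VALUE.
    static int naiveDouble(int size) {
        return size * 2;
    }

    // The same growth computed in long arithmetic and range-checked before
    // narrowing; throws ArithmeticException instead of wrapping.
    static int checkedDouble(int size) {
        return Math.toIntExact(2L * size);
    }

    public static void main(String[] args) {
        // Hypothetical 32 bits per value: one page of 1<<30 values vs 1<<27.
        System.out.println(pageBytes(30, 32)); // 4294967296 (4 GiB)
        System.out.println(pageBytes(27, 32)); // 536870912 (512 MiB)

        // 1<<30 doubled in int arithmetic wraps to 0x80000000.
        System.out.println(Integer.toHexString(naiveDouble(1 << 30))); // 80000000

        try {
            checkedDouble(1 << 30);
        } catch (ArithmeticException e) {
            System.out.println("overflow detected");
        }
    }
}
```

Under these assumptions a smaller page size lowers the largest single allocation the JVM must satisfy, and a wrapped size such as 0x80000000 passed to an array constructor would raise a NegativeArraySizeException.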
--
This message was sent by Atlassian JIRA
(v6.2#6252)