[ 
https://issues.apache.org/jira/browse/LUCENE-5875?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14092936#comment-14092936
 ] 

ASF subversion and git services commented on LUCENE-5875:
---------------------------------------------------------

Commit 1617318 from [~mikemccand] in branch 'dev/branches/branch_4x'
[ https://svn.apache.org/r1617318 ]

LUCENE-5875: reduce page size of backing packed array used by NodeHash when 
building an FST from 1B to 128M values
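
To illustrate why a smaller page size helps, here is a minimal sketch of the arithmetic. It assumes values are packed into a contiguous long[] per page. The class and method names are hypothetical and this is not the actual NodeHash or packed-array code. At a hypothetical 8 bits per value, a page of 1<<30 values needs one contiguous 1 GiB array, while a page of 1<<27 values needs only 128 MiB.

```java
public class PageSizeSketch {

    /** Number of 64-bit longs needed to hold valueCount values at bitsPerValue bits each. */
    public static long longsPerPage(long valueCount, int bitsPerValue) {
        // Round up so a partially filled trailing long is still counted.
        return (valueCount * bitsPerValue + 63) / 64;
    }

    /** Size in bytes of the single contiguous array backing one page. */
    public static long bytesPerPage(long valueCount, int bitsPerValue) {
        return longsPerPage(valueCount, bitsPerValue) * 8L;
    }

    public static void main(String[] args) {
        long oldPage = 1L << 30; // "1B values" in the commit message
        long newPage = 1L << 27; // "128M values" in the commit message

        // The bits-per-value figures below are illustrative assumptions only.
        for (int bits : new int[] {8, 16, 32}) {
            System.out.printf("bitsPerValue=%d: old page=%d MiB, new page=%d MiB%n",
                    bits,
                    bytesPerPage(oldPage, bits) >> 20,
                    bytesPerPage(newPage, bits) >> 20);
        }
    }
}
```

The total memory used is the same either way. What changes is the largest single allocation the JVM has to satisfy, which is what matters when the heap cannot supply one very large contiguous array.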

> Default page/block sizes in the FST package can cause OOMs
> ----------------------------------------------------------
>
>                 Key: LUCENE-5875
>                 URL: https://issues.apache.org/jira/browse/LUCENE-5875
>             Project: Lucene - Core
>          Issue Type: Improvement
>          Components: core/FSTs
>    Affects Versions: 4.9
>            Reporter: Christian Ziech
>            Priority: Minor
>
> We are building some fairly big FSTs (the biggest one having about 500M terms 
> with an average of 20 characters per term) and that works very well so far.
> The problem is that we can use neither the "doShareSuffix" nor the 
> "doPackFST" option of the builder, since both cause exceptions: one is an 
> OOM and the other an IllegalArgumentException for a negative array size in 
> ArrayUtil.
> In theory we still have far more than enough memory available, but it seems 
> that Java for some reason cannot allocate byte or long arrays of the size 
> the NodeHash needs (maybe fragmentation?).
> Reducing the constant in the NodeHash from 1<<30 to e.g. 1<<27 seems to 
> mostly fix the issue. Could the Builder pass its bytesPageBits through to 
> the NodeHash, or could we get a custom parameter for that?
> The other problem we ran into was a NegativeArraySizeException when we tried 
> to pack the FST. It seems that we overflowed to 0x80000000. Unfortunately I 
> accidentally overwrote that exception, but I remember it was triggered by 
> the GrowableWriter for the inCounts in line 728 of the FST. If it helps I 
> can try to reproduce it.
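
The overflow to 0x80000000 mentioned in the description can be reproduced in isolation. The sketch below is a generic illustration of Java int arithmetic, not the actual FST packing code path. The class and method names are hypothetical. Doubling an int size of 1<<30 wraps around to Integer.MIN_VALUE, and allocating an array with that size throws NegativeArraySizeException.

```java
public class OverflowSketch {

    /** Doubles a size using plain int arithmetic, which wraps silently on overflow. */
    public static int naiveDouble(int size) {
        return size * 2;
    }

    /** Doubles a size but throws ArithmeticException instead of wrapping. */
    public static int checkedDouble(int size) {
        return Math.multiplyExact(size, 2);
    }

    /** Returns true if allocating a long[] of the given size fails with a negative-size error. */
    public static boolean allocationFails(int size) {
        try {
            long[] ignored = new long[size];
            return false;
        } catch (NegativeArraySizeException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        int size = 1 << 30;
        int grown = naiveDouble(size); // wraps to Integer.MIN_VALUE, i.e. 0x80000000

        System.out.println("grown == 0x80000000: " + (grown == 0x80000000));
        System.out.println("allocation fails: " + allocationFails(grown));

        try {
            checkedDouble(size);
        } catch (ArithmeticException e) {
            // The checked variant reports the overflow at the point where it happens.
            System.out.println("checkedDouble detected the overflow");
        }
    }
}
```

Using a checked multiply, or computing sizes as long before narrowing, surfaces the problem where the size is calculated rather than at a later allocation.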



--
This message was sent by Atlassian JIRA
(v6.2#6252)

