[ https://issues.apache.org/jira/browse/LUCENE-5591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13984234#comment-13984234 ]
Shai Erera commented on LUCENE-5591:
------------------------------------

BTW, I started by adding {{ramBytesUsed()}} to {{DocValuesFieldUpdates}}, but that was way overestimated, especially when the number of updates is small. That's due to the buffers used by these classes, e.g. GrowableWriter with pageSize=1024. I don't think the RAM representation should be used as an estimate; rather, the avg-update size is closer to what will eventually be written to disk.

> ReaderAndUpdates should create a proper IOContext when writing DV updates
> -------------------------------------------------------------------------
>
>                 Key: LUCENE-5591
>                 URL: https://issues.apache.org/jira/browse/LUCENE-5591
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Shai Erera
>         Attachments: LUCENE-5591.patch
>
>
> Today we pass IOContext.DEFAULT. If DV updates are used in conjunction w/
> NRTCachingDirectory, it means the latter will attempt to write the entire DV
> field in its RAMDirectory, which could lead to OOM.
> Would be good if we can build our own FlushInfo, estimating the number of
> bytes we're about to write. I didn't see off hand a quick way to guesstimate
> that - I thought to use the current DV's sizeInBytes as an approximation, but
> I don't see a way to get it, not a direct way at least.
> Maybe we can use the size of the in-memory updates to guesstimate that
> amount? Something like {{sizeOfInMemUpdates * (maxDoc/numUpdatedDocs)}}? Is
> it a too wild guess?
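
As a rough illustration of the two estimates discussed in this thread, the Java sketch below compares the formula from the issue description, evaluated with integer division, against an estimate based on the average update size. The class and method names are hypothetical and not part of Lucene. The byte counts are assumed to be the logical size of the updated values, not a {{ramBytesUsed()}} figure, which the comment above notes is inflated by buffer pages.

```java
// Hypothetical helper for illustration only; nothing here is Lucene API.
final class DvUpdateSizeEstimate {

  private DvUpdateSizeEstimate() {}

  /**
   * The formula from the issue description taken literally with integer
   * arithmetic: sizeOfInMemUpdates * (maxDoc / numUpdatedDocs).
   * The integer division truncates, so the result can undershoot.
   */
  static long naive(long sizeOfInMemUpdates, int numUpdatedDocs, int maxDoc) {
    if (numUpdatedDocs <= 0 || maxDoc <= 0) {
      return 0L;
    }
    return sizeOfInMemUpdates * (maxDoc / numUpdatedDocs);
  }

  /**
   * Average bytes per updated doc, scaled to every doc in the segment.
   * Assumes the whole field is rewritten, so all maxDoc docs are counted.
   */
  static long fromAverageUpdateSize(long bytesOfUpdates, int numUpdatedDocs, int maxDoc) {
    if (bytesOfUpdates <= 0 || numUpdatedDocs <= 0 || maxDoc <= 0) {
      return 0L;
    }
    double avgBytesPerUpdate = (double) bytesOfUpdates / numUpdatedDocs;
    return (long) Math.ceil(avgBytesPerUpdate * maxDoc);
  }
}
```

For 3 updated docs totalling 24 bytes in a 10-doc segment, {{naive(24, 3, 10)}} gives 72 and {{fromAverageUpdateSize(24, 3, 10)}} gives 80. A value of this kind could then serve as the estimated size when building the FlushInfo that the issue asks for.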