[ https://issues.apache.org/jira/browse/LUCENE-5591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13984473#comment-13984473 ]
Michael McCandless commented on LUCENE-5591:
--------------------------------------------

Looks great Shai, thanks!

Is avgUpdateSize supposed to be "bytes per doc"? If so, instead of bitsPerValue, shouldn't we return bitsPerValue/8, maybe rounded up to the nearest byte? Should we rename the method ... maybe ramBytesPerDoc or something?

Shouldn't BinaryDocValuesFieldUpdates.avgUpdateSize also include the docs/offsets/lengths RAM used too?

Separately, I noticed BinaryDocValuesFieldUpdates's add method is doing a BytesRef.append of each added value ... isn't this slowish (O(N^2) where N = number of docs that have been updated)? BytesRef.append doesn't use ArrayUtil.grow to size the array on overflow...

> ReaderAndUpdates should create a proper IOContext when writing DV updates
> -------------------------------------------------------------------------
>
>                 Key: LUCENE-5591
>                 URL: https://issues.apache.org/jira/browse/LUCENE-5591
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Shai Erera
>         Attachments: LUCENE-5591.patch
>
>
> Today we pass IOContext.DEFAULT. If DV updates are used in conjunction w/
> NRTCachingDirectory, it means the latter will attempt to write the entire DV
> field in its RAMDirectory, which could lead to OOM.
> Would be good if we can build our own FlushInfo, estimating the number of
> bytes we're about to write. I didn't see off hand a quick way to guesstimate
> that - I thought to use the current DV's sizeInBytes as an approximation, but
> I don't see a way to get it, not a direct way at least.
> Maybe we can use the size of the in-memory updates to guesstimate that
> amount? Something like {{sizeOfInMemUpdates * (maxDoc/numUpdatedDocs)}}? Is
> it a too wild guess?
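
For illustration only, the two size estimates discussed in this thread (rounding bitsPerValue up to whole bytes, and scaling the in-memory update size by maxDoc/numUpdatedDocs) might be sketched roughly as below. The class and method names are hypothetical and are not Lucene APIs, and this is not what the attached patch does. The sketch also assumes the ratio should be computed in floating point, since an integer maxDoc/numUpdatedDocs would truncate.

```java
// Hypothetical helper, NOT part of Lucene: a rough sketch of the estimates
// discussed in the thread, under the assumptions stated in the text above.
public final class DocValuesUpdateEstimates {

    private DocValuesUpdateEstimates() {}

    /** Rounds a per-value bit width up to whole bytes (assumed meaning of "bitsPerValue/8, rounded up"). */
    public static long bytesPerValue(int bitsPerValue) {
        if (bitsPerValue < 0) {
            throw new IllegalArgumentException("bitsPerValue must be >= 0");
        }
        return (bitsPerValue + 7L) / 8L;
    }

    /**
     * Guesstimates the bytes about to be written as
     * sizeOfInMemUpdates * (maxDoc / numUpdatedDocs).
     * The ratio is computed as a double so that it is not truncated
     * by integer division.
     */
    public static long estimateFlushBytes(long sizeOfInMemUpdates, int maxDoc, int numUpdatedDocs) {
        if (numUpdatedDocs <= 0) {
            return 0L; // no updated docs: nothing to extrapolate from
        }
        double ratio = (double) maxDoc / numUpdatedDocs;
        return (long) Math.ceil(sizeOfInMemUpdates * ratio);
    }

    public static void main(String[] args) {
        // 9 bits per value needs 2 whole bytes.
        System.out.println(bytesPerValue(9));
        // 4 KB of in-memory updates covering 1,000 of 1,000,000 docs.
        System.out.println(estimateFlushBytes(4096L, 1_000_000, 1_000));
    }
}
```

An estimate like this could then be used to size the flush information passed to the directory, so that a caching directory can decide whether the file is too large to hold in RAM. How well the extrapolation tracks the real file size is an open question in the thread.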