On Mon, Mar 8, 2010 at 1:18 PM, Christopher Laux <ctl...@googlemail.com> wrote:
> I'm not sure if this is the right list, as it's sort of a development
> question too, but I don't want to bother them over there. Anyway, I'm
> curious as to the reason for using "manual memory management" a la
> ByteBlockPool and friends in Java. Is it for performance reasons
> alone, to avoid the allocation and garbage collection of many small
> objects, or is there some residue of C-style thinking in the early
> years?

This was done for performance (to remove alloc/init/GC load).

There are two parts to it -- first, consolidating what used to be lots
of little objects into shared byte[]/int[] blocks. Second, reusing
those blocks. I think the biggest perf gains were from the first
(consolidating tiny objs together), but we probably still have some
gains from the second. A simple test would be to change the pools to
not re-use and then measure indexing throughput.

> Even then, shouldn't there be a more Java-ish solution using the
> existing streams classes? Would that be the way to go if one started
> over? I realize this is not very realistic, I'm asking out of
> curiosity.

Actually that's how Lucene used to work, and then (in 2.3 I think) we
cut over to the current reused-blocks RAM writing.

If we were to start over I don't think I'd change much from where we
are now, at least on this aspect of Lucene. There are plenty of other
things I'd change ;)

But... one can always make a custom indexing chain (it's a
package-private API now, but possible) to do something totally
different. E.g. I think a chain dedicated to inverting tiny docs could
show sizable gains over the default chain Lucene uses today.

Mike
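
As an illustration of the pattern Mike describes, here is a minimal
sketch of a byte-block pool. It uses a hypothetical BlockPool class,
not Lucene's actual ByteBlockPool API: many small writes share large
byte[] blocks instead of each becoming its own object, and blocks are
recycled rather than left for the GC.

import java.util.ArrayList;
import java.util.List;

// Hypothetical block pool sketching the two ideas above:
// (1) consolidation into shared byte[] blocks, (2) block reuse.
// Not Lucene's real ByteBlockPool.
final class BlockPool {
    static final int BLOCK_SIZE = 32 * 1024;

    private final List<byte[]> freeBlocks = new ArrayList<>(); // recycled blocks
    private final List<byte[]> usedBlocks = new ArrayList<>(); // blocks in use
    private byte[] current;        // block currently being filled
    private int upto = BLOCK_SIZE; // write position; forces alloc on first write

    // Append bytes, spilling into a new (or recycled) block when full.
    void append(byte[] src, int off, int len) {
        while (len > 0) {
            if (upto == BLOCK_SIZE) {
                nextBlock();
            }
            int chunk = Math.min(len, BLOCK_SIZE - upto);
            System.arraycopy(src, off, current, upto, chunk);
            upto += chunk;
            off += chunk;
            len -= chunk;
        }
    }

    private void nextBlock() {
        // Reuse a recycled block if one is available; allocate otherwise.
        current = freeBlocks.isEmpty()
            ? new byte[BLOCK_SIZE]
            : freeBlocks.remove(freeBlocks.size() - 1);
        usedBlocks.add(current);
        upto = 0;
    }

    // Return all blocks to the free list, e.g. after a flush.
    void reset() {
        freeBlocks.addAll(usedBlocks);
        usedBlocks.clear();
        current = null;
        upto = BLOCK_SIZE;
    }
}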
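
And a rough harness for the experiment Mike suggests -- run the same
write workload with block reuse (pool.reset() recycles blocks) and
without it (a fresh pool per "flush", so the GC must reclaim the old
blocks) and compare throughput. This assumes the hypothetical
BlockPool above and uses plain wall-clock timing, so treat the numbers
as indicative only:

// Sketch of the reuse-vs-no-reuse throughput test, under the
// assumptions stated above.
final class PoolReuseBench {
    public static void main(String[] args) {
        byte[] doc = new byte[256]; // stand-in for one doc's inverted bytes
        int docsPerFlush = 10_000;
        int flushes = 1_000;

        for (boolean reuse : new boolean[] {true, false}) {
            BlockPool pool = new BlockPool();
            long start = System.nanoTime();
            for (int f = 0; f < flushes; f++) {
                for (int d = 0; d < docsPerFlush; d++) {
                    pool.append(doc, 0, doc.length);
                }
                if (reuse) {
                    pool.reset();           // recycle blocks for the next flush
                } else {
                    pool = new BlockPool(); // drop blocks; GC must reclaim them
                }
            }
            long ms = (System.nanoTime() - start) / 1_000_000;
            System.out.println("reuse=" + reuse + " took " + ms + " ms");
        }
    }
}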