@David, I will certainly update when we get the data re-fed, and if you
have things you'd like to investigate or try out, please let me know. I'm
happy to evaluate things at scale here. We will also be taking this index
from its current 45m records to 600-700m over the next few months.

steve


On Tue, Jul 30, 2013 at 5:10 PM, Steven Bower <sbo...@alcyon.net> wrote:

> Very good read... Already using MMap... verified using pmap and vsz from
> top..
>
> Not sure what you mean by good hit ratio?
>
> Here are the stacks...
>
> Name  Time (ms)  Own Time (ms)
> org.apache.lucene.search.MultiTermQueryWrapperFilter.getDocIdSet(AtomicReaderContext, Bits)  300879  203478
> org.apache.lucene.codecs.lucene41.Lucene41PostingsReader$BlockDocsEnum.nextDoc()  45539  19
> org.apache.lucene.codecs.lucene41.Lucene41PostingsReader$BlockDocsEnum.refillDocs()  45519  40
> org.apache.lucene.codecs.lucene41.Lucene41PostingsReader.readVIntBlock(IndexInput, int[], int[], int, boolean)  24352  0
> org.apache.lucene.store.DataInput.readVInt()  24352  24352
> org.apache.lucene.codecs.lucene41.ForUtil.readBlock(IndexInput, byte[], int[])  21126  14976
> org.apache.lucene.store.ByteBufferIndexInput.readBytes(byte[], int, int)  6150  0
> java.nio.DirectByteBuffer.get(byte[], int, int)  6150  0
> java.nio.Bits.copyToArray(long, Object, long, long, long)  6150  6150
> org.apache.lucene.codecs.BlockTreeTermsReader$FieldReader$SegmentTermsEnum.docs(Bits, DocsEnum, int)  35342  421
> org.apache.lucene.codecs.BlockTreeTermsReader$FieldReader$SegmentTermsEnum$Frame.decodeMetaData()  34920  27939
> org.apache.lucene.codecs.lucene41.Lucene41PostingsReader.nextTerm(FieldInfo, BlockTermState)  6980  6980
> org.apache.lucene.codecs.BlockTreeTermsReader$FieldReader$SegmentTermsEnum.next()  14129  1053
> org.apache.lucene.codecs.BlockTreeTermsReader$FieldReader$SegmentTermsEnum$Frame.loadNextFloorBlock()  5948  261
> org.apache.lucene.codecs.BlockTreeTermsReader$FieldReader$SegmentTermsEnum$Frame.loadBlock()  5686  199
> org.apache.lucene.store.ByteBufferIndexInput.readBytes(byte[], int, int)  3606  0
> java.nio.DirectByteBuffer.get(byte[], int, int)  3606  0
> java.nio.Bits.copyToArray(long, Object, long, long, long)  3606  3606
> org.apache.lucene.codecs.lucene41.Lucene41PostingsReader.readTermsBlock(IndexInput, FieldInfo, BlockTermState)  1879  80
> org.apache.lucene.store.ByteBufferIndexInput.readBytes(byte[], int, int)  1798  0
> java.nio.DirectByteBuffer.get(byte[], int, int)  1798  0
> java.nio.Bits.copyToArray(long, Object, long, long, long)  1798  1798
> org.apache.lucene.codecs.BlockTreeTermsReader$FieldReader$SegmentTermsEnum$Frame.next()  4010  3324
> org.apache.lucene.codecs.BlockTreeTermsReader$FieldReader$SegmentTermsEnum$Frame.nextNonLeaf()  685  685
> org.apache.lucene.codecs.BlockTreeTermsReader$FieldReader$SegmentTermsEnum$Frame.loadBlock()  3117  144
> org.apache.lucene.store.ByteBufferIndexInput.readBytes(byte[], int, int)  1861  0
> java.nio.DirectByteBuffer.get(byte[], int, int)  1861  0
> java.nio.Bits.copyToArray(long, Object, long, long, long)  1861  1861
> org.apache.lucene.codecs.lucene41.Lucene41PostingsReader.readTermsBlock(IndexInput, FieldInfo, BlockTermState)  1090  19
> org.apache.lucene.store.ByteBufferIndexInput.readBytes(byte[], int, int)  1070  0
> java.nio.DirectByteBuffer.get(byte[], int, int)  1070  0
> java.nio.Bits.copyToArray(long, Object, long, long, long)  1070  1070
> org.apache.lucene.codecs.BlockTreeTermsReader$FieldReader$SegmentTermsEnum.initIndexInput()  20  0
> org.apache.lucene.store.ByteBufferIndexInput.clone()  20  0
> org.apache.lucene.store.ByteBufferIndexInput.clone()  20  0
> org.apache.lucene.store.ByteBufferIndexInput.buildSlice(long, long)  20  0
> org.apache.lucene.util.WeakIdentityMap.put(Object, Object)  20  0
> org.apache.lucene.util.WeakIdentityMap$IdentityWeakReference.<init>(Object, ReferenceQueue)  20  0
> java.lang.System.identityHashCode(Object)  20  20
> org.apache.lucene.index.FilteredTermsEnum.docs(Bits, DocsEnum, int)  1485  527
> org.apache.lucene.codecs.BlockTreeTermsReader$FieldReader$SegmentTermsEnum.docs(Bits, DocsEnum, int)  957  0
> org.apache.lucene.codecs.BlockTreeTermsReader$FieldReader$SegmentTermsEnum$Frame.decodeMetaData()  957  513
> org.apache.lucene.codecs.lucene41.Lucene41PostingsReader.nextTerm(FieldInfo, BlockTermState)  443  443
> org.apache.lucene.index.FilteredTermsEnum.next()  874  324
> org.apache.lucene.search.NumericRangeQuery$NumericRangeTermsEnum.accept(BytesRef)  368  0
> org.apache.lucene.util.BytesRef$UTF8SortedAsUnicodeComparator.compare(Object, Object)  368  368
> org.apache.lucene.codecs.BlockTreeTermsReader$FieldReader$SegmentTermsEnum.next()  160  0
> org.apache.lucene.codecs.BlockTreeTermsReader$FieldReader$SegmentTermsEnum$Frame.loadNextFloorBlock()  160  0
> org.apache.lucene.codecs.BlockTreeTermsReader$FieldReader$SegmentTermsEnum$Frame.loadBlock()  160  0
> org.apache.lucene.store.ByteBufferIndexInput.readBytes(byte[], int, int)  120  0
> org.apache.lucene.codecs.lucene41.Lucene41PostingsReader.readTermsBlock(IndexInput, FieldInfo, BlockTermState)  39  0
> org.apache.lucene.codecs.BlockTreeTermsReader$FieldReader$SegmentTermsEnum.seekCeil(BytesRef, boolean)  19  0
> org.apache.lucene.codecs.BlockTreeTermsReader$FieldReader$SegmentTermsEnum$Frame.loadBlock()  19  0
> org.apache.lucene.codecs.BlockTreeTermsReader$FieldReader$SegmentTermsEnum.initIndexInput()  19  0
> org.apache.lucene.store.ByteBufferIndexInput.clone()  19  0
> org.apache.lucene.store.ByteBufferIndexInput.clone()  19  0
> org.apache.lucene.store.ByteBufferIndexInput.buildSlice(long, long)  19  0
> org.apache.lucene.util.WeakIdentityMap.put(Object, Object)  19  0
> org.apache.lucene.util.WeakIdentityMap$IdentityWeakReference.<init>(Object, ReferenceQueue)  19  0
> java.lang.System.identityHashCode(Object)  19  19
> org.apache.lucene.util.FixedBitSet.<init>(int)  28  28
>
>
> On Tue, Jul 30, 2013 at 4:18 PM, Mikhail Khludnev <
> mkhlud...@griddynamics.com> wrote:
>
>> On Tue, Jul 30, 2013 at 12:45 AM, Steven Bower <smb-apa...@alcyon.net>
>> wrote:
>>
>> >
>> > - Most of my time (98%) is being spent in
>> > java.nio.Bits.copyToByteArray(long,Object,long,long) which is being
>>
>>
>> Steven, please read
>> http://blog.thetaphi.de/2012/07/use-lucenes-mmapdirectory-on-64bit.html
>> My benchmarking experience shows that NIO is a turtle, absolutely.
>>
>> Also, are you sure that fq=(vid:86XXX73 OR vid:86XXX20 ..... has a good hit
>> ratio? Otherwise it's a well-known beast.
>>
>> Could you also show a deeper stack, to see what causes the excessive
>> reading?
>>
>>
>>
>> --
>> Sincerely yours
>> Mikhail Khludnev
>> Principal Engineer,
>> Grid Dynamics
>>
>> <http://www.griddynamics.com>
>>  <mkhlud...@griddynamics.com>
>>
>
>
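
One way to read a profile like the one quoted above is to rank its rows by
own time rather than total time, since own time shows where the CPU is spent
in the method itself. The following is only a minimal sketch: it assumes each
row has first been joined onto one line as "method total_ms own_ms", it uses
a handful of rows copied from the quoted stacks rather than the full dump,
and the helper names parse_rows and rank_by_own_time are hypothetical.

```python
import re

# A few rows copied from the quoted profile, one per line:
# "<method signature> <time ms> <own time ms>". This is a subset, not the full dump.
ROWS = """\
org.apache.lucene.search.MultiTermQueryWrapperFilter.getDocIdSet(AtomicReaderContext, Bits) 300879 203478
org.apache.lucene.codecs.lucene41.Lucene41PostingsReader$BlockDocsEnum.nextDoc() 45539 19
org.apache.lucene.store.DataInput.readVInt() 24352 24352
org.apache.lucene.codecs.lucene41.ForUtil.readBlock(IndexInput, byte[], int[]) 21126 14976
org.apache.lucene.codecs.BlockTreeTermsReader$FieldReader$SegmentTermsEnum$Frame.decodeMetaData() 34920 27939
java.nio.Bits.copyToArray(long, Object, long, long, long) 6150 6150
"""

# The signature ends at its closing parenthesis; two integers follow.
ROW_RE = re.compile(r"^(?P<name>.+\))\s+(?P<total>\d+)\s+(?P<own>\d+)$")


def parse_rows(text):
    """Return (method, total_ms, own_ms) tuples for every line that matches."""
    rows = []
    for line in text.splitlines():
        match = ROW_RE.match(line.strip())
        if match:
            rows.append(
                (match.group("name"), int(match.group("total")), int(match.group("own")))
            )
    return rows


def rank_by_own_time(rows):
    """Sort rows so the methods with the largest own time come first."""
    return sorted(rows, key=lambda row: row[2], reverse=True)


ranked = rank_by_own_time(parse_rows(ROWS))

if __name__ == "__main__":
    for name, total_ms, own_ms in ranked:
        # Shorten the fully qualified signature to the bare method name for display.
        short = name.split("(")[0].rsplit(".", 1)[-1]
        print(f"{short:20s} own={own_ms:>7d} ms  total={total_ms:>7d} ms")
```

On this subset the largest own time belongs to
MultiTermQueryWrapperFilter.getDocIdSet, with the byte-copy calls far below
it, so ranking this way points at the filter construction itself rather than
the mmap reads.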
