Don't you think it's worth raising a JIRA issue about those 'new byte[]' allocations? I can provide a patch if you wish.
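
To make the point concrete, here is a minimal sketch of the difference between allocating a fresh array per lookup and copying into the caller's existing buffer. It does not use the Lucene API: `Ref`, `VarBinaryValues`, `getAllocating` and `getReusing` are hypothetical names invented for illustration, with `Ref` only loosely modelled on BytesRef. The loop in `main` also shows the ascending-docId walk over a BitSet of matching docs that was suggested earlier in the thread.

```java
import java.nio.charset.StandardCharsets;
import java.util.BitSet;

final class ReuseBufferSketch {

    /** Hypothetical stand-in for a BytesRef-like holder: buffer, offset, length. */
    static final class Ref {
        byte[] bytes = new byte[0];
        int offset;
        int length;
    }

    /** Hypothetical store of variable-length values concatenated into one array. */
    static final class VarBinaryValues {
        private final byte[] data;
        private final int[] starts; // value i occupies data[starts[i] .. starts[i + 1])

        VarBinaryValues(byte[][] values) {
            starts = new int[values.length + 1];
            int total = 0;
            for (int i = 0; i < values.length; i++) {
                starts[i] = total;
                total += values[i].length;
            }
            starts[values.length] = total;
            data = new byte[total];
            for (int i = 0; i < values.length; i++) {
                System.arraycopy(values[i], 0, data, starts[i], values[i].length);
            }
        }

        /** Allocates a new array on every call, like the pattern questioned above. */
        void getAllocating(int docId, Ref result) {
            int len = starts[docId + 1] - starts[docId];
            byte[] buffer = new byte[len];
            System.arraycopy(data, starts[docId], buffer, 0, len);
            result.bytes = buffer;
            result.offset = 0;
            result.length = len;
        }

        /** Copies into the caller's buffer and grows it only when it is too small. */
        void getReusing(int docId, Ref result) {
            int len = starts[docId + 1] - starts[docId];
            if (result.bytes.length < len) {
                result.bytes = new byte[len];
            }
            System.arraycopy(data, starts[docId], result.bytes, 0, len);
            result.offset = 0;
            result.length = len;
        }
    }

    public static void main(String[] args) {
        VarBinaryValues values = new VarBinaryValues(new byte[][] {
            "alpha".getBytes(StandardCharsets.UTF_8),
            "be".getBytes(StandardCharsets.UTF_8),
            "gamma".getBytes(StandardCharsets.UTF_8),
        });

        BitSet matching = new BitSet();
        matching.set(0);
        matching.set(2);

        // One scratch holder is shared by all lookups; docIds are visited in ascending order.
        Ref scratch = new Ref();
        for (int doc = matching.nextSetBit(0); doc >= 0; doc = matching.nextSetBit(doc + 1)) {
            values.getReusing(doc, scratch);
            String text = new String(scratch.bytes, scratch.offset, scratch.length, StandardCharsets.UTF_8);
            System.out.println(doc + " -> " + text);
        }
    }
}
```

The trade-off with the reusing variant is that the caller must copy the bytes out before the next lookup, since the same array is overwritten each time.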

On Wed, Jan 8, 2014 at 2:02 PM, Mikhail Khludnev <mkhlud...@griddynamics.com> wrote:

> FWIW,
>
> A micro benchmark shows a 4% gain from reusing the incoming BytesRef.bytes for
> short binary docvalues in Test2BBinaryDocValues.testVariableBinary() with an
> mmap directory.
> I wonder why it doesn't read into the incoming bytes:
> https://github.com/apache/lucene-solr/blame/trunk/lucene/core/src/java/org/apache/lucene/codecs/lucene45/Lucene45DocValuesProducer.java#L401
>
> On Wed, Jan 8, 2014 at 12:53 AM, Michael McCandless
> <luc...@mikemccandless.com> wrote:
>
>> Going sequentially should help, if the pages are not hot (in the OS's IO
>> cache).
>>
>> You can also use a different DVFormat, e.g. Direct, but this holds all
>> bytes in RAM.
>>
>> Mike McCandless
>>
>> http://blog.mikemccandless.com
>>
>> On Tue, Jan 7, 2014 at 1:09 PM, Mikhail Khludnev
>> <mkhlud...@griddynamics.com> wrote:
>> > Joel,
>> >
>> > I tried to hack it straightforwardly, but found no free gain there. The
>> > only attempt I can suggest is to try to reuse bytes in
>> > https://github.com/apache/lucene-solr/blame/trunk/lucene/core/src/java/org/apache/lucene/codecs/lucene45/Lucene45DocValuesProducer.java#L401
>> > Right now it allocates bytes every time, which besides GC can also
>> > impact memory access locality. Could you try fixing the memory waste and
>> > repeat the performance test?
>> >
>> > Have a good hack!
>> >
>> > On Mon, Dec 23, 2013 at 9:51 PM, Joel Bernstein <joels...@gmail.com>
>> > wrote:
>> >>
>> >> Hi,
>> >>
>> >> I'm looking for a faster way to perform large scale docId -> bytesRef
>> >> lookups for BinaryDocValues.
>> >>
>> >> I'm finding that I can't get the performance that I need from the
>> >> random access seek in the BinaryDocValues interface.
>> >>
>> >> I'm wondering if sequentially scanning the docValues would be a faster
>> >> approach. I have a BitSet of matching docs, so if I sequentially moved
>> >> through the docValues I could test each one against that bitset.
>> >>
>> >> Wondering if that approach would be faster for bulk extracts and how
>> >> tricky it would be to add an iterator to the BinaryDocValues interface?
>> >>
>> >> Thanks,
>> >> Joel
>> >
>> > --
>> > Sincerely yours
>> > Mikhail Khludnev
>> > Principal Engineer,
>> > Grid Dynamics

--
Sincerely yours
Mikhail Khludnev
Principal Engineer,
Grid Dynamics

<http://www.griddynamics.com>
<mkhlud...@griddynamics.com>