RE: Re: Performance StringCoding.decode

Uwe Schindler Wed, 06 Aug 2014 01:52:04 -0700

Hi,

It looks like you are fetching the stored fields of *all* search results. In 
general, Lucene is designed to return the most relevant documents to the user, 
so stored fields are normally fetched only for the top-ranking results (for 
example, the top 10). If you do this for all results (which can be thousands), 
it is of course a performance problem: the stored fields are compressed on 
disk, and after decompression the bytes have to be converted to UTF-16 Java 
Strings. There is not much Lucene can do about that.
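
To illustrate the point, here is a minimal stand-alone sketch. It is a simulation only and uses no Lucene classes: the class name TopHitsDecoding, the storedBytes helper, and the byte[][] layout are hypothetical stand-ins for stored fields. It shows that the UTF-8 to UTF-16 conversion is paid once per fetched document, so limiting the fetch to the top-ranking hits bounds that cost.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Stand-alone simulation; no Lucene classes are used.
class TopHitsDecoding {

    // Hypothetical stand-in for stored fields: one UTF-8 byte[] per document.
    static byte[][] storedBytes(String... values) {
        byte[][] out = new byte[values.length][];
        for (int i = 0; i < values.length; i++) {
            out[i] = values[i].getBytes(StandardCharsets.UTF_8);
        }
        return out;
    }

    // Decodes the stored value only for the first `limit` doc IDs of a
    // ranked hit list (best hit first); all other documents stay undecoded.
    static List<String> decodeTop(byte[][] stored, int[] rankedDocIds, int limit) {
        int n = Math.min(limit, rankedDocIds.length);
        List<String> result = new ArrayList<>(n);
        for (int i = 0; i < n; i++) {
            byte[] raw = stored[rankedDocIds[i]];
            // This String constructor performs the UTF-8 -> UTF-16 conversion
            // that appears as StringCoding.decode in the profile.
            result.add(new String(raw, 0, raw.length, StandardCharsets.UTF_8));
        }
        return result;
    }

    public static void main(String[] args) {
        byte[][] stored = storedBytes("alpha", "beta", "gamma", "delta");
        int[] ranked = {2, 0, 3, 1}; // doc IDs, best hit first
        System.out.println(decodeTop(stored, ranked, 2)); // prints [gamma, alpha]
    }
}
```

With thousands of hits, passing the full hit count as the limit would run the decode step thousands of times per query; passing 10 runs it at most 10 times.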


If you use stored fields for ranking purposes (inside function queries), you 
should change them to numeric docvalues fields.
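
As a rough sketch of why this helps (again a simulation, not Lucene code; the class NumericColumnSketch and both methods are hypothetical): a value kept as stored text must be decoded to a String and parsed on every access, while a per-document numeric column can be read directly by doc ID. In Lucene the corresponding field type is NumericDocValuesField; the real on-disk encoding is more involved than the plain long[] assumed here.

```java
import java.nio.charset.StandardCharsets;

// Stand-alone simulation of the two access patterns; no Lucene classes are used.
class NumericColumnSketch {

    // Stored-field style: the number is kept as UTF-8 text and must be
    // decoded to a String and parsed every time it is needed for scoring.
    static long fromStoredText(byte[][] storedText, int docId) {
        String text = new String(storedText[docId], StandardCharsets.UTF_8);
        return Long.parseLong(text);
    }

    // Doc-values style: one numeric value per document, read directly by doc ID
    // without creating any String.
    static long fromColumn(long[] column, int docId) {
        return column[docId];
    }

    public static void main(String[] args) {
        byte[][] storedText = {
            "10".getBytes(StandardCharsets.UTF_8),
            "250".getBytes(StandardCharsets.UTF_8)
        };
        long[] column = {10L, 250L};
        // Both paths yield the same value; only the per-access cost differs.
        System.out.println(fromStoredText(storedText, 1) == fromColumn(column, 1));
    }
}
```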

Uwe

-----
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: [email protected]

> -----Original Message-----
> From: Sascha Janz [mailto:[email protected]]
> Sent: Wednesday, August 06, 2014 10:27 AM
> To: [email protected]
> Subject: Aw: Re: Performance StringCoding.decode
> 
> I used JMC (Java Mission Control) from JDK 7u40+.
> 
> 
> see here
> 
> 
> http://www.oracle.com/technetwork/java/javase/2col/jmc-relnotes-2004763.html
> 
> 
> 
> Sent: Tuesday, 05 August 2014 at 17:41
> From: "[email protected]" <[email protected]>
> To: "[email protected]" <[email protected]>
> Subject: Re: Performance StringCoding.decode
>
> How do you monitor this? Do you use JProfiler?
> 
> 
> 
> 
> 
> From: Sascha Janz
> Date: 2014-08-05 22:36
> To: [email protected]
> Subject: Performance StringCoding.decode
>
> Hi,
>
> I want to speed up our search performance, so I ran tests and monitored them
> with Java Mission Control.
> 
> The analysis showed that one hotspot is:
> 
> 
> sun.nio.cs.UTF_8$Decoder.decode(byte[], int, int, char[])
> - java.lang.StringCoding.decode(Charset, byte[], int, int)
> - java.lang.String.<init>(byte[], int, int, Charset)
> - org.apache.lucene.codecs.compressing.CompressingStoredFieldsReader.readField(DataInput, StoredFieldVisitor, FieldInfo, int)
> - org.apache.lucene.codecs.compressing.CompressingStoredFieldsReader.visitDocument(int, StoredFieldVisitor)
> - org.apache.lucene.index.SegmentReader.document(int, StoredFieldVisitor)
> - org.apache.lucene.index.IndexReader.document(int, Set)
> 
> We use JDK 1.7.55 and Lucene 4.9.0.
>
> Is there a chance to speed this up, for example by changing something in the
> Lucene IndexWriterConfig, such as using another codec?
>
> We use the default values of IndexWriterConfig.
> 
> 
> regards
> sascha
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [email protected]
> For additional commands, e-mail: [email protected]
> 


