[jira] [Commented] (LUCENE-6322) IndexSearcher.doc(int docID, Set<String> fieldsToLoad) is slower in Lucene 4.9 when compared to Lucene 2.9

Stanislav Palatnik (JIRA) Sun, 01 May 2016 13:06:55 -0700

    [ 
https://issues.apache.org/jira/browse/LUCENE-6322?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15265919#comment-15265919
 ]


Stanislav Palatnik commented on LUCENE-6322:
--------------------------------------------

Is there an alternative 4.x codec that does not use 
CompressingStoredFieldsFormat?

> IndexSearcher.doc(int docID, Set<String> fieldsToLoad) is slower in Lucene 4.9 when 
> compared to Lucene 2.9
> --------------------------------------------------------------------------------------------------
>
>                 Key: LUCENE-6322
>                 URL: https://issues.apache.org/jira/browse/LUCENE-6322
>             Project: Lucene - Core
>          Issue Type: Bug
>          Components: core/codecs
>    Affects Versions: 4.9
>         Environment: Windows, JDK 7/8
>            Reporter: Sekhar
>             Fix For: 4.10.5
>
>
> We use the IndexSearcher.doc(int docID, Set<String> fieldsToLoad) method to 
> load a document with only selected stored fields. When the stored fields we 
> leave out of fieldsToLoad hold more than 500KB of data, this call is slower 
> in Lucene 4.9 than in Lucene 2.9.
> I debugged the above method with Lucene 4.9 and found that 
> CompressingStoredFieldsReader#visitDocument(int docID, StoredFieldVisitor 
> visitor) spends extra time loading file content and decompressing it in 
> 16KB chunks, even for fields that are only skipped. The degradation is 
> noticeable when a document's field is larger than 1MB and we call this 
> method in a loop for more than 1000 such documents.
> In Lucene 2.9 there was no compression, and skipping a field was just a file 
> seek that moved the pointer to the next stored field. For example, see how 
> Lucene3xStoredFieldsReader#skipField() skips a field in the Lucene 2.9 
> format, which is much faster than in Lucene 4.9.
> CompressingStoredFieldsReader should have a way to know a field's compressed 
> length in the file, so that it can seek to the next pointer instead of 
> loading the content from the file and decompressing it in 16KB chunks just 
> to skip the field.
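
The seek-based skip described in the quoted report can be sketched in a few lines. This is only a toy model under stated assumptions: the record layout (field number, length, value bytes) and the names SkipFieldSketch, encode and load are hypothetical and are not Lucene's on-disk format or API. It shows why knowing a field's length up front lets a reader step over a large field without touching its bytes.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Toy model only: NOT Lucene's stored-fields format or any Lucene class.
final class SkipFieldSketch {

    // Hypothetical layout: [int fieldNumber][int length][value bytes], repeated.
    static ByteBuffer encode(String[] values) {
        byte[][] raw = new byte[values.length][];
        int size = 0;
        for (int i = 0; i < values.length; i++) {
            raw[i] = values[i].getBytes(StandardCharsets.UTF_8);
            size += 2 * Integer.BYTES + raw[i].length;
        }
        ByteBuffer buf = ByteBuffer.allocate(size);
        for (int i = 0; i < raw.length; i++) {
            buf.putInt(i).putInt(raw[i].length).put(raw[i]);
        }
        buf.flip();
        return buf;
    }

    // Returns the value of wantedField, or null if it is absent.
    // bytesCopied[0] is increased by the number of value bytes actually copied.
    static String load(ByteBuffer doc, int wantedField, long[] bytesCopied) {
        ByteBuffer in = doc.duplicate(); // independent position, shared content
        while (in.hasRemaining()) {
            int field = in.getInt();
            int length = in.getInt();
            if (field == wantedField) {
                byte[] value = new byte[length];
                in.get(value);
                bytesCopied[0] += length;
                return new String(value, StandardCharsets.UTF_8);
            }
            // Skip: move the position past the value without reading it.
            in.position(in.position() + length);
        }
        return null;
    }

    public static void main(String[] args) {
        char[] big = new char[1_000_000];
        java.util.Arrays.fill(big, 'x');
        ByteBuffer doc = encode(new String[] {"title", new String(big), "tail"});
        long[] bytesCopied = new long[1];
        String value = load(doc, 2, bytesCopied);
        System.out.println(value + " loaded, value bytes copied: " + bytesCopied[0]);
    }
}
```

In this model the cost of skipping the large middle field does not depend on its size. When field data sits inside compressed chunks, a reader cannot advance this way unless the format records where each field ends in the compressed stream, which is the gap the report points at.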



