[jira] [Commented] (LUCENE-3638) IndexReader.document always return a doc with all the stored fields loaded. And this can be slow for the indexed document contain huge fields

Robert Muir (Commented) (JIRA) Sun, 11 Dec 2011 10:03:11 -0800

    [ https://issues.apache.org/jira/browse/LUCENE-3638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13167149#comment-13167149 ]


Robert Muir commented on LUCENE-3638:
-------------------------------------

{quote}
I don't mind much either. It's just that this sugar method suggests that you 
have to create a Set<String> on every call, while if we point people to DSFV, 
people will find that they can pass String... too.
{quote}

True, but that's just because DSFV creates the HashSet on the fly :)

{quote}
Perhaps if we omit the sugar method, people will think that way, and indeed 
create the object just once. Dunno, it's your call.
{quote}

That's true too, because if you reuse the DSFV then the String... method is not 
harmful, since you are only doing it once.
So I think the String... method is OK on DSFV for this reason.

However, on IndexReader I think it's also OK to have a sugar method taking a Set, 
because it just creates a DSFV around that HashSet,
so it's hardly wasteful.

In other words: create a Set<String> once and reuse your own Set via the 
proposed sugar method, and I think it's fine, and a lot friendlier. 
It's not hashing anything. Sure, it's creating a DSFV each time, but like 
using a DSFV in any way, it's also creating a Document object each time. 
If you are really worried about this stuff, implement your own visitor 
and don't use Document at all :) Don't forget we are talking about stored 
fields too!
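
To illustrate the "implement your own visitor" suggestion, here is a minimal, self-contained sketch. Everything in it is hypothetical: `SketchStoredFieldVisitor`, `SnippetVisitor`, and `visitDocument` are simplified stand-ins written for this example and are not Lucene's actual classes or signatures. The idea it shows is only that a visitor can keep the one piece of data it wants without building a Document-like container.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical, simplified visitor contract (an assumption for this sketch;
// Lucene's real stored-field visitor API differs).
interface SketchStoredFieldVisitor {
    boolean needsField(String name);
    void stringField(String name, String value);
}

// Keeps a short prefix of a single field and never builds a Document-like object.
final class SnippetVisitor implements SketchStoredFieldVisitor {
    private final String field;
    private final int maxChars;
    private String snippet;

    SnippetVisitor(String field, int maxChars) {
        this.field = field;
        this.maxChars = maxChars;
    }

    @Override
    public boolean needsField(String name) {
        return field.equals(name);
    }

    @Override
    public void stringField(String name, String value) {
        // Retain only the first maxChars characters of the value.
        snippet = value.substring(0, Math.min(maxChars, value.length()));
    }

    String snippet() {
        return snippet;
    }
}

final class CustomVisitorSketch {
    // Stand-in for a reader walking one document's stored fields.
    static void visitDocument(Map<String, String> storedFields,
                              SketchStoredFieldVisitor visitor) {
        for (Map.Entry<String, String> e : storedFields.entrySet()) {
            if (visitor.needsField(e.getKey())) {
                visitor.stringField(e.getKey(), e.getValue());
            }
        }
    }

    public static void main(String[] args) {
        Map<String, String> doc = new LinkedHashMap<String, String>();
        doc.put("title", "Stored fields");
        doc.put("body", "a very large body of text");
        SnippetVisitor visitor = new SnippetVisitor("body", 6);
        visitDocument(doc, visitor);
        System.out.println(visitor.snippet());
    }
}
```

Note that in this sketch the full field value is still handed to the visitor; it only avoids the per-call container object, not the read of the stored field itself.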

And I say keep the String... method on DSFV only, and don't add it to IR, 
so we don't encourage lots of wasteful rehashing.
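
The trade-off argued above can be sketched in plain Java. This is an illustration under stated assumptions, not Lucene code: `SketchFieldVisitor` is a hypothetical stand-in for DSFV (DocumentStoredFieldVisitor), `visitorFor` models the proposed Set-based sugar method on the reader, and the `setsBuilt` counter exists only to make the hashing cost visible.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical stand-in for DSFV; not Lucene's actual class.
final class SketchFieldVisitor {
    // Counts how many HashSets the varargs constructor has built (demo only).
    static int setsBuilt = 0;

    private final Set<String> fieldsToLoad;

    // Wraps the caller's set as-is: no copying and no rehashing.
    SketchFieldVisitor(Set<String> fieldsToLoad) {
        this.fieldsToLoad = fieldsToLoad;
    }

    // Convenience form: hashes every field name into a fresh set on each call.
    SketchFieldVisitor(String... fields) {
        this(new HashSet<String>(Arrays.asList(fields)));
        setsBuilt++;
    }

    boolean needsField(String name) {
        return fieldsToLoad.contains(name);
    }
}

final class SugarMethodSketch {
    // Models the proposed reader sugar method: one cheap wrapper per call.
    static SketchFieldVisitor visitorFor(Set<String> fieldsToLoad) {
        return new SketchFieldVisitor(fieldsToLoad);
    }

    public static void main(String[] args) {
        Set<String> fields = new HashSet<String>(Arrays.asList("title", "date"));

        // Set path: the same caller-owned set is reused for every document.
        for (int doc = 0; doc < 1000; doc++) {
            visitorFor(fields);
        }
        System.out.println("sets built via Set path: " + SketchFieldVisitor.setsBuilt);

        // Varargs path used per document: a new set is hashed every time.
        for (int doc = 0; doc < 1000; doc++) {
            new SketchFieldVisitor("title", "date");
        }
        System.out.println("sets built via varargs path: " + SketchFieldVisitor.setsBuilt);
    }
}
```

Used once and then reused, the varargs form costs a single set construction, which is why it is harmless on the visitor; exposed on the reader, it would invite the per-document rebuild shown in the second loop.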

                
> IndexReader.document always return a doc with all the stored fields loaded. 
> And this can be slow for the indexed document contain huge fields
> ---------------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: LUCENE-3638
>                 URL: https://issues.apache.org/jira/browse/LUCENE-3638
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: core/index, core/search
>    Affects Versions: 4.0
>         Environment: 64bit linux java 1.6
>            Reporter: peter chang
>            Priority: Minor
>              Labels: patch
>             Fix For: 4.0
>
>         Attachments: doc.fields.patch
>
>
> When generating a digest for documents with huge fields, it should be 
> unnecessary to load the whole field; only the interesting part of the field, 
> located via the offset information, is needed. But IndexReader always returns 
> the whole field content. As a result, a customized StoredFieldsReader ends up 
> loading the field a second time.


        
