Toke: I think part of it is locality. By that I mean two docValues fields in the same document have no relation to each other in terms of where they sit on disk. So _assuming_ your docValues can't all be held in memory, you may be doing a bunch of disk seeks.
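
A toy model of that locality argument, including the stored-fields case discussed next. This is only a back-of-envelope sketch, not a measurement: the function names and the numbers are made up for illustration, and it assumes a cold cache, one unrelated on-disk region per docValues field, and one stored-fields block that holds every field of a document.

```python
# Toy model of the locality argument: count worst-case disk seeks needed to
# return `num_fields` fields for each of `num_docs` documents.
# Assumptions (not measurements): nothing is cached, each docValues field
# lives in its own on-disk region, and one stored-fields block holds every
# field of a document.

def docvalues_seeks(num_docs: int, num_fields: int) -> int:
    # One jump per (document, field) pair, since the columns are unrelated on disk.
    return num_docs * num_fields


def stored_fields_seeks(num_docs: int) -> int:
    # One seek plus one block decompression per document covers all its fields.
    return num_docs


if __name__ == "__main__":
    docs, fields = 10, 20  # hypothetical: a page of 10 hits, 20 fields each
    print("docValues seeks:", docvalues_seeks(docs, fields))
    print("stored-fields seeks:", stored_fields_seeks(docs))
```

The model says nothing about the case where the docValues fit in memory or in the OS page cache; there the seek counts largely stop mattering.
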
Compare that with just storing the fields, which implies one disk seek/decompression for all fields of a given doc (assuming the 16K block that gets read and decompressed holds all the fields).

And maybe part of it is the notion that stuffing large text fields into a DocValues field just to return them seems like abusing DV.

That said, the Streaming code uses DV fields exclusively, and I got 200K rows/second returned without tuning a single thing, which I doubt you're going to get with stored fields!

So I think, as usual, "it depends".

On Mon, Sep 24, 2018 at 10:25 AM Toke Eskildsen <t...@kb.dk> wrote:
>
> David Smiley <david.w.smi...@gmail.com> wrote:
> > I don't think it makes a difference if some people think docValues should
> > never be used for value-retrieval. When that performance drop occurred
> > due to those changes, I'm sure it would have affected sorting & faceting
> > as well as value-retrieval. Some more than others perhaps.
>
> Yes. The iterative API is fine for relatively small jumps, so it works
> perfectly for sorting on medium- to large result sets. Depending on the type
> of faceting it's the same. Grouping and faceting on small result sets is
> (probably) relatively affected, but as the amount of needed data is small in
> those cases, the (assumed) impact is not that high.
>
> Retrieving documents is different as there are typically more fields involved
> and the amount of documents itself is nearly always small, which means large
> jumps repeated for all the fields.
>
> > I don't see any disagreement about improving docValues in the ways
> > you suggest.
>
> You are right about that. I apologize if I was being unclear: It is not the
> concrete patch I am asking about, that's just how this started. I am asking
> for background on why it is considered misuse to use Doc Values for document
> retrieval.
>
> - Toke Eskildsen
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
> For additional commands, e-mail: dev-h...@lucene.apache.org