Hi Ashish,

On Thu, Aug 14, 2014 at 12:35 AM, Ashish Mishra <laughingbud...@gmail.com> wrote:
> That sounds possible. We are using spindle disks. I have ~36Gb free for
> the filesystem cache, and the previous data size (without the added field)
> was 60-65Gb per node. So it's likely that >50% of queries were previously
> addressed out of the FS cache, even more if queries are unevenly
> distributed.
> Data size is now 200Gb/node. So only ~18% of queries could hit the cache
> and the rest would incur seek times.
>
> Hmm... given this knowledge, is there a way to mitigate the effect without
> moving everything to SSD? Only a minority of queries return the stored
> field and it is not indexed. Ideally, it would be stored in separate
> (colocated) files from the indexed fields. That way, most queries would be
> unaffected and only those returning the value incur the seek cost.
>
> I imagine indexes with _source enabled would see similar effects.
>
> Is a parent-child relationship a good way to achieve the scenario above?
> The parent can contain indexed fields and the child has stored fields.
> Not sure if this just introduces new problems.
>

I think that you don't even need parent/child relations for this. If you
identify a few large stored fields that you rarely need, you could store
them in a different index with the same _id and only GET them on demand.

-- 
Adrien Grand
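
To make the "different index with the same _id" idea concrete, here is a minimal sketch in Python. It only illustrates the data layout, not a real deployment: `InMemoryIndex` is a hypothetical stand-in for an index (it is not the Elasticsearch client), and the field names `title` and `payload` used in the usage note are made up. With a real cluster, the two `put` calls would presumably become index requests against two separate indices, and `fetch_large_fields` a GET by id.

```python
from typing import Any, Dict, Iterable, Optional, Tuple


def split_document(
    doc: Dict[str, Any], large_fields: Iterable[str]
) -> Tuple[Dict[str, Any], Dict[str, Any]]:
    """Split one document into a searchable part and a rarely-read part."""
    large = set(large_fields)
    search_part = {k: v for k, v in doc.items() if k not in large}
    blob_part = {k: v for k, v in doc.items() if k in large}
    return search_part, blob_part


class InMemoryIndex:
    """Hypothetical stand-in for one index: documents keyed by _id."""

    def __init__(self) -> None:
        self._docs: Dict[str, Dict[str, Any]] = {}

    def put(self, doc_id: str, body: Dict[str, Any]) -> None:
        self._docs[doc_id] = dict(body)

    def get(self, doc_id: str) -> Optional[Dict[str, Any]]:
        return self._docs.get(doc_id)


def index_split(
    search_index: InMemoryIndex,
    blob_index: InMemoryIndex,
    doc_id: str,
    doc: Dict[str, Any],
    large_fields: Iterable[str],
) -> None:
    """Write both halves under the same _id so they can be joined later."""
    search_part, blob_part = split_document(doc, large_fields)
    search_index.put(doc_id, search_part)
    if blob_part:
        blob_index.put(doc_id, blob_part)


def fetch_large_fields(blob_index: InMemoryIndex, doc_id: str) -> Dict[str, Any]:
    """Fetch the large fields on demand; empty dict if none were stored."""
    return blob_index.get(doc_id) or {}
```

For example, calling `index_split(search_index, blob_index, "42", {"title": "hello", "payload": big_value}, ["payload"])` leaves only `title` in the searchable index, so ordinary queries never touch the large field; only callers that actually need `payload` pay for the extra lookup via `fetch_large_fields(blob_index, "42")`.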