Also - does "fielddata": { "loading": "eager" } make sense with doc_values in this use case? Would that combination be supported in the future?
On Tuesday, April 21, 2015 at 2:14:03 AM UTC+3, Itai Frenkel wrote:
>
> Itamar,
>
> 1. The _source field includes many fields that are only indexed, and
> many fields that are only needed in the query search result. _source
> includes them both. Projecting the requested fields out of _source is too
> CPU intensive to do at search time for each result, especially if the size
> is big.
> 2. I agree that adding another NoSQL store could solve this problem; however,
> it is currently out of scope, as it would require syncing data with another
> data store.
> 3. Wouldn't a big stored field bloat the Lucene index size? Even if
> not, aren't not_analyzed fields destined to be (or already)
> doc_values?
>
> On Tuesday, April 21, 2015 at 1:36:20 AM UTC+3, Itamar Syn-Hershko wrote:
>>
>> This is how _source works. doc_values don't make sense in this regard -
>> what you are looking for is using stored fields and having the transform
>> script write to that. Loading stored fields (even one field per hit) may be
>> slower than loading and parsing _source, though.
>>
>> I'd just put this logic in the indexer, though. It will definitely help
>> with other things as well, such as nasty huge mappings.
>>
>> Alternatively, find a way to avoid IO completely. How about using ES for
>> search and something like Riak for loading the actual data, if IO costs are
>> so noticeable?
>>
>> --
>>
>> Itamar Syn-Hershko
>> http://code972.com | @synhershko <https://twitter.com/synhershko>
>> Freelance Developer & Consultant
>> Lucene.NET committer and PMC member
>>
>> On Mon, Apr 20, 2015 at 11:18 PM, Itai Frenkel <itaif...@live.com> wrote:
>>
>>> Hi,
>>>
>>> We are having a performance problem in which, for each hit, Elasticsearch
>>> parses the entire _source and then generates a new JSON with only the
>>> requested query _source fields.
>>> In order to overcome this issue we would like to use a
>>> mapping transform script that serializes the requested query fields (which
>>> are known in advance) into a doc_value. Does that make sense?
>>>
>>> The actual problem with the transform script is a SecurityException that
>>> does not allow using any JSON serialization mechanism. A binary
>>> serialization would also be OK.
>>>
>>>
>>> Itai
>>>
>>> --
>>> You received this message because you are subscribed to the Google
>>> Groups "elasticsearch" group.
>>> To unsubscribe from this group and stop receiving emails from it, send
>>> an email to elasticsearc...@googlegroups.com.
>>> To view this discussion on the web visit
>>> https://groups.google.com/d/msgid/elasticsearch/b897aba2-c250-4474-a03f-1d2a993baef9%40googlegroups.com.
>>> For more options, visit https://groups.google.com/d/optout.
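
For reference, the "put this logic in the indexer" suggestion from the thread could look roughly like the sketch below. It is only an illustration under assumptions, not something posted by either participant: the field names (`title`, `price`, `thumbnail_url`), the blob field name `result_blob`, and the helper names are all hypothetical, and the mapping fragment and the `fields` search parameter assume Elasticsearch 1.x syntax, which was current at the time of the thread. The idea is that the indexing client serializes the result fields once, so no transform script (and no SecurityException) is involved, and the search asks only for that one stored field.

```python
import json

# Hypothetical: the fields that search results need, known in advance.
RESULT_FIELDS = ["title", "price", "thumbnail_url"]

# Hypothetical name of the single stored field that carries the projection.
BLOB_FIELD = "result_blob"

# Mapping fragment, assuming Elasticsearch 1.x syntax:
# the blob is stored for retrieval but not indexed for search.
MAPPING_FRAGMENT = {
    BLOB_FIELD: {"type": "string", "index": "no", "store": True}
}


def add_result_blob(doc, result_fields=RESULT_FIELDS):
    """Return a copy of `doc` with the result projection pre-serialized.

    Runs in the indexing client, before the document is sent to
    Elasticsearch. Fields missing from `doc` are skipped.
    """
    projection = {name: doc[name] for name in result_fields if name in doc}
    enriched = dict(doc)  # shallow copy; the caller's dict is not modified
    enriched[BLOB_FIELD] = json.dumps(
        projection, sort_keys=True, separators=(",", ":")
    )
    return enriched


def search_body(query, size=10):
    """Build a search request body that asks only for the stored blob.

    Assumes the 1.x `fields` parameter, which loads stored fields.
    """
    return {"query": query, "size": size, "fields": [BLOB_FIELD]}
```

The client would then decode `result_blob` from each hit with `json.loads`. Whether this actually beats `_source` parsing depends on the data, as noted in the thread (loading a stored field per hit may itself be slower), so it would need to be measured.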