Itamar,

1. The _source field contains many fields that are only indexed, and many fields that are only needed in the search result; _source includes both. Projecting the requested fields out of _source is too CPU-intensive to do at search time for each hit, especially if the size is big.
2. I agree that adding another NoSQL store could solve this problem, but it is currently out of scope, as it would require syncing data with another data store.
3. Wouldn't a big stored field bloat the Lucene index size? Even if not, aren't not_analyzed fields destined to be (or already) doc_values?
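
To make the indexer-side option from the quoted reply concrete, here is a minimal sketch of pre-serializing the result fields before a document is sent to Elasticsearch, so that nothing has to be projected out of _source at search time. The field list, the `result_json` field name, and both helper functions are hypothetical and do not come from the thread; this is an illustration under those assumptions, not a tested setup.

```python
import json

# Hypothetical: the fields every query asks for, known in advance.
RESULT_FIELDS = ("title", "price", "url")


def prepare_for_indexing(doc, result_fields=RESULT_FIELDS):
    """Return a copy of `doc` with one extra, pre-serialized projection field.

    The original document is left untouched; missing fields are skipped.
    """
    projection = {name: doc[name] for name in result_fields if name in doc}
    enriched = dict(doc)
    # "result_json" is an assumed field name. It would need to be mapped as a
    # stored, non-indexed string so it can be fetched without parsing _source.
    enriched["result_json"] = json.dumps(
        projection, separators=(",", ":"), sort_keys=True
    )
    return enriched


def read_hit(stored_value):
    """Decode the pre-serialized projection returned for a single hit."""
    return json.loads(stored_value)


if __name__ == "__main__":
    doc = {"title": "a", "price": 3, "body": "long text that is only indexed"}
    enriched = prepare_for_indexing(doc)
    print(enriched["result_json"])
    print(read_hit(enriched["result_json"]))
```

Whether fetching this single stored field is actually cheaper than _source filtering would have to be measured, since the quoted reply notes that loading stored fields may be slower than loading and parsing _source.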
On Tuesday, April 21, 2015 at 1:36:20 AM UTC+3, Itamar Syn-Hershko wrote:
>
> This is how _source works. doc_values don't make sense in this regard -
> what you are looking for is using stored fields and having the transform
> script write to that. Loading stored fields (even one field per hit) may be
> slower than loading and parsing _source, though.
>
> I'd just put this logic in the indexer, though. It will definitely help
> with other things as well, such as nasty huge mappings.
>
> Alternatively, find a way to avoid IO completely. How about using ES for
> search and something like Riak for loading the actual data, if IO costs are
> so noticeable?
>
> --
>
> Itamar Syn-Hershko
> http://code972.com | @synhershko <https://twitter.com/synhershko>
> Freelance Developer & Consultant
> Lucene.NET committer and PMC member
>
> On Mon, Apr 20, 2015 at 11:18 PM, Itai Frenkel <itaif...@live.com> wrote:
>
>> Hi,
>>
>> We are having a performance problem in which, for each hit, Elasticsearch
>> parses the entire _source and then generates a new JSON with only the
>> requested query _source fields. To overcome this issue we would like to use
>> a mapping transform script that serializes the requested query fields (which
>> are known in advance) into a doc_value. Does that make sense?
>>
>> The actual problem with the transform script is a SecurityException that
>> does not allow using any JSON serialization mechanism. A binary
>> serialization would also be OK.
>>
>> Itai

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/630a2998-e2a9-44a3-9c93-e692be2c2338%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.