Indexing performance with doc values (particularly with larger number of fields)

Alex at Ikanow Sun, 23 Mar 2014 19:01:23 -0700

This might be more of a Lucene question, but a quick Google search didn't 
turn up anything.


Has anyone done or seen any benchmarking of the indexing overhead incurred 
by using doc values?

I often index quite large JSON objects with many fields (e.g. 50). I'm 
trying to get a feel for whether I can just let all of them be doc values 
on the off chance I'll want to aggregate over them, or whether I need to 
pick beforehand which fields will support aggregation.
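
To make the two options concrete, here is a rough sketch of what I mean. It 
assumes the Elasticsearch 1.x per-field mapping setting 
"fielddata": {"format": "doc_values"}; the field names and the 
build_mapping helper are made up for illustration, and it only builds the 
mapping body rather than talking to a cluster.

```python
import json

# Hypothetical field names standing in for the ~50 fields of a large JSON object.
FIELDS = ["field_%02d" % i for i in range(50)]


def build_mapping(fields, doc_values_fields):
    """Return a mapping body in which only `doc_values_fields` use doc values.

    Assumption: the Elasticsearch 1.x per-field setting
    "fielddata": {"format": "doc_values"}. All other fields are left on the
    default in-heap field cache.
    """
    selected = set(doc_values_fields)
    properties = {}
    for name in fields:
        # Use not_analyzed strings, since doc values apply to un-analyzed values.
        spec = {"type": "string", "index": "not_analyzed"}
        if name in selected:
            spec["fielddata"] = {"format": "doc_values"}
        properties[name] = spec
    return {"properties": properties}


# Option 1: every field gets doc values, on the off chance it is aggregated later.
all_doc_values = build_mapping(FIELDS, FIELDS)

# Option 2: pick the aggregation fields beforehand (a hypothetical choice of three).
picked = build_mapping(FIELDS, ["field_00", "field_07", "field_42"])

print(json.dumps(picked["properties"]["field_07"], indent=2))
```

The question is essentially what option 1 costs at index time compared with 
option 2.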

(A related question: presumably mixing doc values fields and "legacy" 
field cache fields is a bad idea, because with doc values you want a low 
max heap so that the file system cache has lots of memory available, 
whereas with the field cache you need a large heap. Is that about right, 
or am I missing something?)

Thanks for any insight!

Alex
Ikanow

