Hi - I recently ran into some surprising elasticsearch behavior and I'd 
appreciate some verification on the "whys" behind what we saw. Basically, 
during a cluster restart we lost some index metadata, so those indices were 
not recognized and loaded from the data nodes (the raw index files still 
existed on disk). Then, before we noticed and had a chance to recover them, 
new incoming data caused the cluster to create new indices under the same 
names, completely overwriting the original raw index data on disk and 
losing a lot of data. If that's unclear, or for further details, I've 
posted the scenario and straightforward steps to reproduce: 
https://github.com/dpb587/elasticsearch-lost-index.

These are my core questions...

1. Is it true that index metadata (number of shards, mappings, etc.) will 
only ever be stored on master-capable nodes? Previously, my understanding 
of the master was that it was primarily responsible for managing cluster 
state and coordinating cluster balancing, not persisting index metadata. 
(I'm not arguing that it doesn't make sense, just that I didn't realize 
"cluster state" included the index metadata.)

2. Is there documentation on elasticsearch.org which more precisely defines 
the responsibilities of master and data nodes? The only vague references 
I've come across are 
http://www.elasticsearch.org/guide/en/elasticsearch/reference/1.x/modules-node.html,
the elasticsearch default configuration file, and various non-authoritative 
blog posts and Stack Overflow answers, none of which prompted me to realize 
data nodes would not hold their own metadata.
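
For context, by master and data nodes I mean the roles set with the 
`node.master` and `node.data` settings in elasticsearch.yml. A minimal 
sketch of the kind of split I'm describing (illustrative values, not copied 
from our actual configuration):

```yaml
# elasticsearch.yml on a dedicated master node (illustrative)
# Eligible to be elected master; holds no shard data.
node.master: true
node.data: false

# elasticsearch.yml on a data-only node (illustrative)
# Holds shard data; never eligible to be elected master.
# node.master: false
# node.data: true
```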

3. Is it true that elasticsearch (Lucene?) will overwrite existing data 
files without error or warning if the cluster is not aware of the index? If 
so, is there a way to disable that behavior to avoid accidental data loss 
due to misconfiguration (aside from the broad `action.auto_create_index` 
setting)? If not, is there anything else which would explain the behavior 
we saw?
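
For reference, the broad setting I mentioned would be set like this in 
elasticsearch.yml. As I understand it, this turns off automatic index 
creation for every index name, which is why I'm hoping something narrower 
exists:

```yaml
# elasticsearch.yml
# Disable automatic index creation for all index names.
# Indices must then be created explicitly before documents are indexed.
action.auto_create_index: false
```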

Thank you for your time!

Danny
