Hi,

First work out how much data a single shard should hold, then divide your total volume by that figure to get the number of shards you need, and scale out to a corresponding number of nodes. At a minimum, plan to grow the cluster as your data volume grows.
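To make that concrete, here is a rough back-of-the-envelope sketch using the figures from the original post (1M docs/day, ~500KB each, one year of retention). The 30 GB target shard size is purely an assumption for illustration; pick a figure based on your own hardware and testing:

```python
# Capacity math for the use case in this thread.
# Assumed: 1M docs/day at ~500 KB each, 365-day retention,
# and a hypothetical 30 GB target per primary shard.
DOCS_PER_DAY = 1_000_000
DOC_SIZE_KB = 500
RETENTION_DAYS = 365
TARGET_SHARD_GB = 30  # assumption; tune for your own hardware

daily_gb = DOCS_PER_DAY * DOC_SIZE_KB / 1_000_000   # KB -> GB (decimal)
total_tb = daily_gb * RETENTION_DAYS / 1000          # GB -> TB
shards = daily_gb * RETENTION_DAYS / TARGET_SHARD_GB

print(f"~{daily_gb:.0f} GB/day, ~{total_tb:.0f} TB/year, "
      f"~{shards:.0f} primary shards at {TARGET_SHARD_GB} GB each")
```

That works out to roughly 500 GB/day and ~180 TB/year, which matches the estimate in the original post, and it shows why the shard count (thousands of primaries at any reasonable shard size) forces you toward a large cluster and a sensible partitioning scheme.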
It is difficult to predict what problems may arise; your case as stated is too generic. Some questions to consider:

- What will the usage of the cluster be? What queries will you perform? Will you mostly index and only occasionally query, or will you query your data intensively?
- Most importantly, how will you partition your data? Will you have one index, or multiple indices (a logstash-style approach)?
- What will you do with data older than a year: delete it?
- Can you afford to lose data? Will you keep backups?

Maybe check here: https://www.found.no/foundation/sizing-elasticsearch/

IMHO, these are some of the questions you must answer in order to see whether such an approach suits your needs. It comes down to hardware, and to the structure and partitioning of your data.

Thomas

On Wednesday, 17 September 2014 13:41:55 UTC+3, P Suman wrote:

> Hello,
>
> We are planning to use ES as a primary datastore. Here is my use case:
>
> We receive a million transactions per day (all are inserts). Each
> transaction is around 500KB in size and has 10 fields; we should be
> able to search on all 10 fields. We want to keep around 1 yr worth of
> data, which comes to around 180TB.
>
> Can you please let me know any problems that might arise if I use
> Elasticsearch as the primary datastore.
>
> Regards,
> Suman
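The logstash-style partitioning with one-year retention mentioned above could be sketched roughly like this. The index prefix, daily naming scheme, and 365-day cutoff are all assumptions for illustration, not anything prescribed in this thread:

```python
# Sketch: daily indices named "transactions-YYYY.MM.DD" with one-year
# retention. Expired indices would then be dropped with an HTTP DELETE
# against the cluster (e.g. curl -XDELETE http://localhost:9200/<name>).
from datetime import date, timedelta

def index_name(day, prefix="transactions"):
    """Daily index name for a given date (naming scheme is an assumption)."""
    return f"{prefix}-{day:%Y.%m.%d}"

def expired_indices(existing, today, keep_days=365):
    """Return the names in `existing` whose date is older than the cutoff."""
    cutoff = today - timedelta(days=keep_days)
    expired = []
    for name in existing:
        # recover the date from the "prefix-YYYY.MM.DD" suffix
        day = date.fromisoformat(name.rsplit("-", 1)[1].replace(".", "-"))
        if day < cutoff:
            expired.append(name)
    return expired

today = date(2014, 9, 17)
existing = [index_name(today - timedelta(days=d)) for d in (0, 100, 400)]
print(expired_indices(existing, today))
```

With daily indices, "deleting data older than a year" becomes a cheap whole-index delete instead of an expensive delete-by-query across one giant index, which is the main practical argument for the logstash approach at this volume.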