Thank you for the feedback, guys; it is greatly appreciated. I had not thought about file descriptors, so that gives me another thing to think about.
Our daily volume will be pretty high across all of our users. I don't think we have a great estimate, but right now we are at about 50 million documents a day and ~30 users. Our cluster is in EC2, so we can adjust size and node count basically whenever we need to; I don't think that is a huge issue, assuming we get the index layout correct.

At present we are using one large index, which is causing some performance issues, as you would expect. We also did not get our sharding correct originally, so now we have really large shards.

We have a requirement to keep 90 days of data per user, with an upper bound on users (indeterminate, though) of 1000. So that would be 90,000 indexes if we did one index per user per day. I guess I am wondering whether that is a crazy thing to attempt, or whether it makes more sense to break it up weekly or monthly instead in order to keep the index count down.

Our documents are usually pretty small (or what I would consider small) at <= 1K, but we will receive them basically constantly. So I guess I am looking for tips on how we can lay out and break up indexes to get the best performance as we grow.

Again, thank you for the feedback, and I appreciate any more in advance!

Thanks,

On Thursday, October 9, 2014 1:18:07 PM UTC-4, gugod wrote:
>
> Mark Walkom writes:
>
> > Did you get better writes?
> > What sort of storage are you on, did you measure before and after, are you
> > reaching I/O limits?
>
> We pump realtime log data and only measure the overall processing
> throughput instead of low level IO throughput (we had the data, but we
> did not correlate those data with setting change). The disk is just some
> hard drive but not SSD or some hybrid disk. We did not reach disk or
> network IO limit before and afterwards. FD limits was the only limit we
> ran into.
>
> --
> Cheers,
> Kang-min Liu
>
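P.S. To make the index-count question concrete, here is a rough sketch of the arithmetic behind the 90,000 figure and the weekly/monthly alternatives. It assumes one index per user per period, treats a month as 30 days, and ignores the extra index a calendar-aligned window can straddle. The shards_per_index and replicas values are placeholders I made up for illustration, not our actual settings.

```python
import math

RETENTION_DAYS = 90   # retention requirement stated above
MAX_USERS = 1000      # rough upper bound on users, not a firm number

# Approximate length of each index period in days (a month is taken as 30).
PERIOD_DAYS = {"daily": 1, "weekly": 7, "monthly": 30}


def index_count(period, users=MAX_USERS, retention_days=RETENTION_DAYS):
    """Indexes needed to cover the retention window, one set per user."""
    per_user = math.ceil(retention_days / PERIOD_DAYS[period])
    return per_user * users


def shard_count(period, shards_per_index=1, replicas=1):
    """Total shards (primaries plus replicas) across all indexes.

    shards_per_index and replicas are hypothetical values for illustration.
    """
    return index_count(period) * shards_per_index * (1 + replicas)


if __name__ == "__main__":
    for period in PERIOD_DAYS:
        print(period, index_count(period), shard_count(period))
```

Under those assumptions this gives 90,000 daily, 13,000 weekly, or 3,000 monthly indexes, and with one primary and one replica each the shard totals are double that. Since every shard holds its own set of open files, the shard total is probably the number to watch given the FD limits mentioned earlier in the thread.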