Thank you for the feedback, guys. It is greatly appreciated.
I had not thought about file descriptors, so that gives me another thing to 
think about.

Our daily volume will be pretty high across all of our users. I don't think 
we have a great estimate, but right now we are at about 50 million 
documents a day across ~30 users.
Our cluster is in EC2, so we can adjust instance size and node count 
basically whenever we need to. I don't think that is a huge issue, assuming 
we get the index layout correct.
At present we are using one large index, which is causing some performance 
issues, as you would expect. We also did not get our sharding correct 
originally, so now we have really large shards.

We have a requirement to keep 90 days of data per user, with an upper 
bound (still indeterminate) of about 1000 users. So that would be 
90,000 indexes if we did one index per user per day.
I guess I am wondering if that is a crazy thing to attempt to do, or if it 
makes more sense to break it up weekly or monthly instead in order to keep 
the index count down.
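
For reference, here is a quick back-of-the-envelope script for the index 
counts being compared. It is only a sketch: the 1000-user ceiling and the 
90-day retention come from the numbers above, while the 7-day week, the 
30-day month, and the function name are simplifying assumptions made up 
for illustration.

```python
import math

def index_count(users, retention_days, days_per_index):
    """Indexes needed when each user gets one index per time bucket."""
    # Round up: a partially filled bucket still needs its own index.
    buckets_per_user = math.ceil(retention_days / days_per_index)
    return users * buckets_per_user

USERS = 1000          # upper bound mentioned above (still indeterminate)
RETENTION_DAYS = 90   # retention requirement per user

# Assumed bucket lengths: 1 day, 7-day week, 30-day month.
for label, days in [("daily", 1), ("weekly", 7), ("monthly", 30)]:
    print(label, index_count(USERS, RETENTION_DAYS, days))
```

Under those assumptions this gives 90,000 daily, 13,000 weekly, or 3,000 
monthly indexes. A rolling 90-day window can straddle one extra bucket per 
user, so the real counts may be slightly higher.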

Our documents are usually pretty small (or what I would consider small) at 
<= 1K, but we will receive them basically constantly.
So I guess I am looking for tips on how we can lay out and break up indexes 
to get the best performance benefit as we grow.
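
As a rough sanity check on the sizes involved, the current numbers can be 
turned into a raw ingest estimate. This is a sketch only: it assumes 
"<= 1K" means at most 1024 bytes per document and ignores indexing 
overhead, replicas, and compression, so the real on-disk size will differ. 
The function names are hypothetical.

```python
DOCS_PER_DAY = 50_000_000        # current volume quoted above
MAX_DOC_BYTES = 1024             # assumption: "<= 1K" means at most 1 KiB
SECONDS_PER_DAY = 24 * 60 * 60

def daily_raw_bytes(docs_per_day, doc_bytes):
    """Upper bound on raw document bytes received per day."""
    return docs_per_day * doc_bytes

def average_docs_per_second(docs_per_day):
    """Average ingest rate, assuming documents arrive evenly all day."""
    return docs_per_day / SECONDS_PER_DAY

raw = daily_raw_bytes(DOCS_PER_DAY, MAX_DOC_BYTES)
print(f"raw upper bound: {raw / 1e9:.1f} GB/day")
print(f"average rate: {average_docs_per_second(DOCS_PER_DAY):.0f} docs/s")
```

With those assumptions that is at most about 51.2 GB of raw documents per 
day, at an average of roughly 579 documents per second.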

Again, thank you for the feedback. I appreciate any more in advance!

Thanks,

On Thursday, October 9, 2014 1:18:07 PM UTC-4, gugod wrote:
>
>
> Mark Walkom writes: 
>
> > Did you get better writes? 
> > What sort of storage are you on, did you measure before and after, are you 
> > reaching I/O limits? 
>
> We pump realtime log data and only measure the overall processing 
> throughput instead of low level IO throughput (we had the data, but we 
> did not correlate those data with the setting change). The disk is just some 
> hard drive, not an SSD or some hybrid disk. We did not reach disk or 
> network IO limits before or afterwards. The FD limit was the only limit we 
> ran into. 
>
> -- 
> Cheers, 
> Kang-min Liu 
>
