Re: bulk indexing and number of shards

2014-04-14 Thread joergpra...@gmail.com
Although many shards mean higher concurrency, a single shard also has very high concurrency. ES concurrency is implemented independently of the number of shards, and the configuration of thread pools and modules is very flexible. The default settings are for a small ES system which can scale to a arou

Re: bulk indexing and number of shards

2014-04-14 Thread eunever32
Thanks guys, yes, I already have the refresh interval at -1. What I'm suggesting is that to support multiple client threads, say 50, it seems that 50 shards is a big help, i.e. more shards equals more concurrency. Thanks.

Re: bulk indexing and number of shards

2014-04-14 Thread joergpra...@gmail.com
More thoughts in addition to Nik's: - the default setting is to refresh every second. Refresh works very fast when segments are small. If you have more than one shard and use bulk indexing, the segments stay small enough for a fast refresh for a longer time. So you will observe a faster bulk indexing, but o

Re: bulk indexing and number of shards

2014-04-13 Thread Nik Everett
Sorry, you can't reduce it. I imagine the performance increase you get is because the merge logic is per shard so it does less when there are more shards for the same data. You can likely get similar numbers if you set the refresh interval to -1 and play with the merge policy before the bulk loa
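
To make the refresh interval suggestion concrete, here is a minimal sketch of the request sequence around a bulk load: turn refresh off, load, then turn it back on. It only builds the request bodies and does not talk to a cluster. The helper names and the index name `myindex` are hypothetical, and restoring to "1s" assumes the index was using the default interval. The merge policy tuning mentioned above is not shown.

```python
import json


def refresh_settings_body(interval):
    """Build the JSON body for PUT /<index>/_settings that sets refresh_interval."""
    return json.dumps({"index": {"refresh_interval": interval}})


def bulk_load_plan(index, restore_interval="1s"):
    """Return (method, path, body) steps: disable refresh, bulk load, restore refresh.

    `index` and `restore_interval` are illustrative parameters, not values
    taken from the thread.
    """
    settings_path = "/%s/_settings" % index
    return [
        ("PUT", settings_path, refresh_settings_body("-1")),
        # The bulk payload itself is supplied by the caller, so body is None here.
        ("POST", "/%s/_bulk" % index, None),
        ("PUT", settings_path, refresh_settings_body(restore_interval)),
    ]


if __name__ == "__main__":
    for method, path, body in bulk_load_plan("myindex"):
        print(method, path, body)
```

Each step can be sent with whatever HTTP client you already use; the point is only that the refresh setting is changed before and after the load rather than during it.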

bulk indexing and number of shards

2014-04-12 Thread eunever32
Hi, I'm testing on a single node. I find I can get better bulk indexing performance when the index has more shards. Does that make sense? My own theory is that when I have multiple bulk clients, then by increasing shards the server achieves better concurrency(?) So if I increase the shards
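
For anyone wanting to reproduce the "multiple bulk clients" setup, the following is a rough sketch of one way to drive several concurrent bulk requests from a single process. It is not the code used in the test described above. The `send` callable is a hypothetical stand-in for whatever posts a body to the `_bulk` endpoint, and the batch size and worker count are arbitrary. The body format is the newline-delimited action line plus source line that the bulk API expects, with the index (and type, where required) assumed to be given in the request URL.

```python
import json
from concurrent.futures import ThreadPoolExecutor


def bulk_body(docs):
    """Build a newline-delimited bulk body from (doc_id, source) pairs."""
    lines = []
    for doc_id, source in docs:
        lines.append(json.dumps({"index": {"_id": doc_id}}))  # action line
        lines.append(json.dumps(source))                       # document source
    # The bulk body must end with a trailing newline.
    return "\n".join(lines) + "\n"


def chunks(items, size):
    """Split a list into consecutive batches of at most `size` items."""
    return [items[i:i + size] for i in range(0, len(items), size)]


def run_clients(docs, batch_size, workers, send):
    """Send one bulk body per batch using `workers` client threads.

    `send` is a caller-supplied function (hypothetical here) that takes a
    bulk body string and returns some result. Results come back in batch order.
    """
    bodies = [bulk_body(batch) for batch in chunks(docs, batch_size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(send, bodies))


if __name__ == "__main__":
    docs = [(str(i), {"n": i}) for i in range(5)]
    # Stand-in sender that just counts the documents in each body.
    print(run_clients(docs, 2, 3, lambda body: body.count("\n") // 2))
```

Varying `workers` against a fixed shard count, and then the shard count against a fixed `workers`, is one way to separate the client-side concurrency effect from the per-shard effect discussed in the replies.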