Re: Bulk indexing slow down when data amount increase

2014-01-14 Thread joergpra...@gmail.com
replica = 0 reduces the indexing workload to only the required (primary) shards, so no duplicate indexing occurs. Do not forget to increase the replica level after the bulk load has completed. queue = 50 instructs a node to reject bulk requests when more than 50 bulk requests per node are active. This saves a node from bei
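
A minimal sketch of the replica advice above, assuming the replica count is changed through an index settings update with a JSON body. `index.number_of_replicas` is the standard setting name; the helper name `replica_settings_body` and the restored value of 1 are placeholders for illustration. The queue limit is a node-level thread pool setting (in 1.x-era releases this was, to my knowledge, `threadpool.bulk.queue_size` in `elasticsearch.yml`), so it appears only as a comment.

```python
import json


def replica_settings_body(replicas):
    """Return the JSON body for an index settings update that sets the replica count."""
    if replicas < 0:
        raise ValueError("replica count must be non-negative")
    return json.dumps({"index": {"number_of_replicas": replicas}})


# Before the bulk load: no replicas, so each document is indexed only once.
before_bulk = replica_settings_body(0)

# After the bulk load: raise the replica level again.
# The value 1 is a placeholder; use whatever the index normally runs with.
after_bulk = replica_settings_body(1)

# Assumption: the bulk queue limit is configured per node, not per index,
# e.g. a line like "threadpool.bulk.queue_size: 50" in elasticsearch.yml.

print(before_bulk)
print(after_bulk)
```

Either body would be sent to the index's settings endpoint; the sketch deliberately stops at building the payload so it runs without a cluster.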

Re: Bulk indexing slow down when data amount increase

2014-01-14 Thread Eric Lu
I have set the replica to 0 and the queue to 50, and it can now index about 7 to 8 million documents per hour. That's acceptable, though I don't know which change made the difference. Thank you all. On Monday, January 13, 2014 at 9:04:35 PM UTC+8, Eric Lu wrote: > > I observed the GC occurred once every 15 seconds when heap mem was
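
To put the reported numbers side by side, here is a small worked calculation. It compares the "10 million docs in 12 hours" figure quoted elsewhere in this thread with the 7 to 8 million documents per hour reported above; the function name `docs_per_second` is just an illustrative helper.

```python
def docs_per_second(docs, hours):
    """Convert a document count over a number of hours into docs per second."""
    return docs / (hours * 3600.0)


before = docs_per_second(10_000_000, 12)    # original rate from the thread
after_low = docs_per_second(7_000_000, 1)   # lower bound after the changes
after_high = docs_per_second(8_000_000, 1)  # upper bound after the changes

print(round(before), round(after_low), round(after_high))
print(round(after_low / before, 1), round(after_high / before, 1))
```

Under these figures the throughput went from roughly 231 docs/s to somewhere around 1944 to 2222 docs/s, a speedup of about 8.4x to 9.6x.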

Re: Bulk indexing slow down when data amount increase

2014-01-13 Thread Karol Gwaj
Have you tried any of the Elasticsearch health monitoring plugins? For example, 'ElasticSearch HQ' has a 'Node Diagnostics' option that will point out the weak points of your cluster and suggest possible solutions (very useful if you are just starting your adventure with Elasticsearch). 'bigdesk' is also very good

Re: Bulk indexing slow down when data amount increase

2014-01-13 Thread Eric Lu
I observed that GC occurred once every 15 seconds when heap memory was at 75% of the heap size. Is that too frequent? There are no OOMs. I set the refresh interval to 30s. I'll try a smaller queue and set replica to 0. Thank you. On Monday, January 13, 2014 at 8:42:56 PM UTC+8, Jörg Prante wrote: > > 12 hours is an absurd

Re: Bulk indexing slow down when data amount increase

2014-01-13 Thread joergpra...@gmail.com
12 hours is an absurdly long time for indexing 10 million docs. queue:1000 is much too high for production. For testing it may be OK (it effectively disables queue rejections), but in production you run the risk of starving your cluster resources. Do you monitor the resource usage of ES, espe
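
With a low queue limit, the node will reject bulk requests under load, so the client has to cope with rejections. A minimal sketch of one common way to do that, exponential backoff, is shown here. This is not something the thread prescribes; `send_bulk` is a hypothetical callable that returns True when the batch is accepted and False when it is rejected, and the retry limits are arbitrary illustrative values.

```python
import time


def send_with_backoff(send_bulk, batch, max_retries=5, base_delay=0.1, sleep=time.sleep):
    """Call send_bulk(batch) until it is accepted, backing off exponentially on rejection.

    Returns the number of retries that were needed.
    """
    for attempt in range(max_retries + 1):
        if send_bulk(batch):
            return attempt
        if attempt < max_retries:
            # Wait base_delay, then 2x, 4x, ... before trying again.
            sleep(base_delay * (2 ** attempt))
    raise RuntimeError("bulk request still rejected after %d retries" % max_retries)
```

Backing off gives the node time to drain its queue instead of being hit again immediately, which fits the intent of keeping the queue small.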

Bulk indexing slow down when data amount increase

2014-01-13 Thread Eric Lu
Hi guys, I'm using Elasticsearch to index a large number of documents. A document is about 0.5KB. My Elasticsearch cluster has 5 nodes (all data nodes). Each node runs Oracle Java version 1.7.0_13 and has 16GB RAM, with 8GB allocated to the JVM. And the index has 50 shards and 1 r