Hello Joshua,

I am not sure which variable you are referring to in the memory settings of the config file. Please paste the comment and the config. I usually change the config from the init.d script.
The best approach would be to bulk index, say, 10,000 feeds in sync mode, wait until everything is indexed, and then proceed to the next batch. I am not sure about the Java API, but a long time ago I used to curl this stats API and see how many requests were rejected.

Thanks
Vineeth

On Tue, Sep 9, 2014 at 8:58 PM, Joshua P <jpetersen...@gmail.com> wrote:

> You also said you wouldn't recommend indexing that much information at
> once. How would you suggest breaking it up and what status should I look
> for before doing another batch? I have to come up with some process that is
> repeatable and mostly automated.
>
> On Tuesday, September 9, 2014 11:12:59 AM UTC-4, Joshua P wrote:
>>
>> Thanks for the reply, Vineeth!
>>
>> What's a practical heap size? I've seen some people saying they set it to
>> 30gb but this confuses me because in the /etc/default/elasticsearch file,
>> the comment suggests the max is only 1gb?
>>
>> I'll look into the threadpool issue. Is there a Java API for monitoring
>> Cluster Node health? Can you point me at an example or give me a link to
>> that?
>>
>> Thanks!
>>
>> On Tuesday, September 9, 2014 10:52:35 AM UTC-4, vineeth mohan wrote:
>>>
>>> Hello Joshua,
>>>
>>> I have a feeling this has something to do with the threadpool.
>>> There is a limit on the number of feeds that can be queued for indexing.
>>>
>>> Try increasing the size of the threadpool queue of index and bulk to a large
>>> number.
>>> Also, through the cluster node API on threadpool, you can see if any request
>>> has failed.
>>> Monitor this API for any failed request due to large volume.
>>>
>>> Threadpool - http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/modules-threadpool.html
>>> Threadpool stats - http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/cluster-nodes-stats.html
>>>
>>> Having said that, I won't recommend bulk indexing that much information
>>> at a time, and 512 MB is not going to help much.
>>>
>>> Thanks
>>> Vineeth
>>>
>>> On Tue, Sep 9, 2014 at 7:48 PM, Joshua P <jpeter...@gmail.com> wrote:
>>>
>>>> Hi there!
>>>>
>>>> I'm trying to do a one-time index of about 800,000 records into an
>>>> instance of elasticsearch. But I'm having a bit of trouble. It continually
>>>> fails around 200,000 records. Looking at it in the Elasticsearch Head Plugin,
>>>> my index goes offline and becomes unrecoverable.
>>>>
>>>> For now, I have it running on a VM on my personal machine.
>>>>
>>>> VM Config:
>>>> Ubuntu Server 14.04 64-Bit
>>>> 8 GB RAM
>>>> 2 Processors
>>>> 32 GB SSD
>>>>
>>>> Java
>>>> java version "1.7.0_65"
>>>> OpenJDK Runtime Environment (IcedTea 2.5.1)
>>>> (7u65-2.5.1-4ubuntu1~0.14.04.2)
>>>> OpenJDK 64-Bit Server VM (build 24.65-b04, mixed mode)
>>>>
>>>> Elasticsearch is using mostly the defaults. This is the output of:
>>>> curl http://localhost:9200/_nodes/process?pretty
>>>> {
>>>>   "cluster_name" : "property_transaction_data",
>>>>   "nodes" : {
>>>>     "KlFkO_qgSOKmV_jjj5xeVw" : {
>>>>       "name" : "Marvin Flumm",
>>>>       "transport_address" : "inet[/192.168.133.131:9300]",
>>>>       "host" : "ubuntu-es",
>>>>       "ip" : "127.0.1.1",
>>>>       "version" : "1.3.2",
>>>>       "build" : "dee175d",
>>>>       "http_address" : "inet[/192.168.133.131:9200]",
>>>>       "process" : {
>>>>         "refresh_interval_in_millis" : 1000,
>>>>         "id" : 1092,
>>>>         "max_file_descriptors" : 65535,
>>>>         "mlockall" : true
>>>>       }
>>>>     }
>>>>   }
>>>> }
>>>>
>>>> I adjusted ES_HEAP_SIZE to 512mb.
>>>>
>>>> I'm using the following code to pull data from SQL Server and index it.
>>>>
>>>> --
>>>> You received this message because you are subscribed to the Google
>>>> Groups "elasticsearch" group.
>>>> To unsubscribe from this group and stop receiving emails from it, send
>>>> an email to elasticsearc...@googlegroups.com.
>>>> To view this discussion on the web visit
>>>> https://groups.google.com/d/msgid/elasticsearch/f94f96d4-8c3f-462f-bdcf-df717cbc6269%40googlegroups.com.
>>>> For more options, visit https://groups.google.com/d/optout.
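
A rough sketch of the batch-and-check approach suggested in this reply, written in Python rather than the Java API discussed in the thread. It assumes the nodes stats endpoint at /_nodes/stats/thread_pool reports a `rejected` counter per thread pool (as it does in Elasticsearch 1.x), and that `send_batch` is a hypothetical callable you supply which blocks until a batch has been indexed. All function names here are illustrative and not part of any Elasticsearch client.

```python
import json
from urllib.request import urlopen


def chunked(records, size):
    """Yield consecutive lists of at most `size` records."""
    batch = []
    for record in records:
        batch.append(record)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:
        yield batch


def total_rejected(stats, pools=("bulk", "index")):
    """Sum the `rejected` counters of the given thread pools over all nodes."""
    total = 0
    for node in stats.get("nodes", {}).values():
        thread_pool = node.get("thread_pool", {})
        for pool in pools:
            total += thread_pool.get(pool, {}).get("rejected", 0)
    return total


def fetch_stats(base_url="http://localhost:9200"):
    """Read thread pool stats from the nodes stats API (assumed endpoint)."""
    with urlopen(base_url + "/_nodes/stats/thread_pool") as response:
        return json.loads(response.read().decode("utf-8"))


def index_in_batches(records, send_batch, get_stats=fetch_stats, batch_size=10000):
    """Send records one batch at a time and check for rejections in between.

    `send_batch` is a hypothetical callable that must block until the batch
    has been indexed (for example, a synchronous bulk request).
    Returns the number of records sent; raises RuntimeError as soon as the
    rejected counter grows beyond its value at the start.
    """
    sent = 0
    baseline = total_rejected(get_stats())
    for batch in chunked(records, batch_size):
        send_batch(batch)
        sent += len(batch)
        rejected = total_rejected(get_stats())
        if rejected > baseline:
            raise RuntimeError(
                "%d requests rejected after %d records" % (rejected - baseline, sent)
            )
    return sent
```

Passing the stats reader in as `get_stats` keeps the loop testable without a running node; against a real instance, `index_in_batches(rows, my_bulk_sender)` would poll the stats API after every batch of 10,000.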