I just reran the indexer and this error came up. I'm running out of disk space on the partition ES wants to write to:
F38KqHhnRDWtiJCss5Wz0g -- INTERNAL_SERVER_ERROR -- TranslogException[[index_type][0] Failed to write operation [org.elasticsearch.index.translog.Translog$Create@6f1f6b1e]]; nested: IOException[No space left on device]; -- index_type

Where would I change the write location? Which config file?

On Tuesday, September 9, 2014 1:28:21 PM UTC-4, Joshua P wrote:
>
> Hi Jörg,
>
> Can you elaborate on what you mean by "I still need more fine tuning"?
>
> I've upped the heap size to 4g (in both places I mentioned before because
> it's not clear to me which one ES actually uses). I haven't tried to index
> again yet.
> Other than throttling my indexing, what are some other things I need to be
> thinking about?
>
> On Tuesday, September 9, 2014 12:53:35 PM UTC-4, Jörg Prante wrote:
>>
>> Set ES_HEAP_SIZE to at least 1 GB. For smaller heaps like 512m and
>> indexing around 1 million docs, you need some more fine tuning, which is
>> complicated. Your machine is OK to set the heap to 4 GB, which is 50% of
>> 8 GB RAM.
>>
>> Jörg
>>
>> On Tue, Sep 9, 2014 at 5:39 PM, Joshua P <jpeter...@gmail.com> wrote:
>>
>>> Here is /etc/default/elasticsearch
>>>
>>> # Run Elasticsearch as this user ID and group ID
>>> #ES_USER=elasticsearch
>>> #ES_GROUP=elasticsearch
>>>
>>> # Heap Size (defaults to 256m min, 1g max)
>>> ES_HEAP_SIZE=512m
>>>
>>> # Heap new generation
>>> #ES_HEAP_NEWSIZE=
>>>
>>> # max direct memory
>>> #ES_DIRECT_SIZE=
>>>
>>> # Maximum number of open files, defaults to 65535.
>>> MAX_OPEN_FILES=65535
>>>
>>> # Maximum locked memory size. Set to "unlimited" if you use the
>>> # bootstrap.mlockall option in elasticsearch.yml. You must also set
>>> # ES_HEAP_SIZE.
>>> MAX_LOCKED_MEMORY=unlimited
>>>
>>> # Maximum number of VMA (Virtual Memory Areas) a process can own
>>> #MAX_MAP_COUNT=262144
>>>
>>> # Elasticsearch log directory
>>> #LOG_DIR=/var/log/elasticsearch
>>>
>>> # Elasticsearch data directory
>>> #DATA_DIR=/var/lib/elasticsearch
>>>
>>> # Elasticsearch work directory
>>> #WORK_DIR=/tmp/elasticsearch
>>>
>>> # Elasticsearch configuration directory
>>> #CONF_DIR=/etc/elasticsearch
>>>
>>> # Elasticsearch configuration file (elasticsearch.yml)
>>> #CONF_FILE=/etc/elasticsearch/elasticsearch.yml
>>>
>>> # Additional Java OPTS
>>> #ES_JAVA_OPTS=
>>>
>>> # Configure restart on package upgrade (true, every other setting will
>>> # lead to not restarting)
>>> #RESTART_ON_UPGRADE=true
>>>
>>> I also see the same setting in /etc/init.d/elasticsearch. Do you know
>>> which file takes priority? And what a good size would be?
>>>
>>> On Tuesday, September 9, 2014 11:32:19 AM UTC-4, vineeth mohan wrote:
>>>>
>>>> Hello Joshua,
>>>>
>>>> I am not sure which variable you are referring to in the memory
>>>> settings in the config file. Please paste the comment and config.
>>>> I usually change the config from the init.d script.
>>>>
>>>> The best approach would be to bulk index, say, 10,000 feeds in sync mode,
>>>> wait until everything is indexed, and then proceed to the next batch.
>>>> I am not sure about the Java API, but a long time ago I used to curl the
>>>> stats API and see how many requests were rejected.
>>>>
>>>> Thanks
>>>> Vineeth
>>>>
>>>> On Tue, Sep 9, 2014 at 8:58 PM, Joshua P <jpeter...@gmail.com> wrote:
>>>>
>>>>> You also said you wouldn't recommend indexing that much information at
>>>>> once. How would you suggest breaking it up and what status should I look
>>>>> for before doing another batch? I have to come up with some process that
>>>>> is repeatable and mostly automated.
>>>>>
>>>>> On Tuesday, September 9, 2014 11:12:59 AM UTC-4, Joshua P wrote:
>>>>>>
>>>>>> Thanks for the reply, Vineeth!
>>>>>>
>>>>>> What's a practical heap size? I've seen some people saying they set
>>>>>> it to 30gb but this confuses me because in the
>>>>>> /etc/default/elasticsearch file, the comment suggests the max is
>>>>>> only 1gb?
>>>>>>
>>>>>> I'll look into the threadpool issue. Is there a Java API for
>>>>>> monitoring Cluster Node health? Can you point me at an example or give
>>>>>> me a link to that?
>>>>>>
>>>>>> Thanks!
>>>>>>
>>>>>> On Tuesday, September 9, 2014 10:52:35 AM UTC-4, vineeth mohan wrote:
>>>>>>>
>>>>>>> Hello Joshua,
>>>>>>>
>>>>>>> I have a feeling this has something to do with the threadpool.
>>>>>>> There is a limit on the number of feeds that can be queued for indexing.
>>>>>>>
>>>>>>> Try increasing the size of the threadpool queue for index and bulk to
>>>>>>> a large number.
>>>>>>> Also, through the cluster node API on threadpool, you can see if any
>>>>>>> request has failed.
>>>>>>> Monitor this API for any failed requests due to large volume.
>>>>>>>
>>>>>>> Threadpool - http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/modules-threadpool.html
>>>>>>> Threadpool stats - http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/cluster-nodes-stats.html
>>>>>>>
>>>>>>> Having said that, I wouldn't recommend bulk indexing that much
>>>>>>> information at a time, and 512 MB is not going to help much.
>>>>>>>
>>>>>>> Thanks
>>>>>>> Vineeth
>>>>>>>
>>>>>>> On Tue, Sep 9, 2014 at 7:48 PM, Joshua P <jpeter...@gmail.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Hi there!
>>>>>>>>
>>>>>>>> I'm trying to do a one-time index of about 800,000 records into an
>>>>>>>> instance of elasticsearch. But I'm having a bit of trouble. It
>>>>>>>> continually fails around 200,000 records. Looking at it in the
>>>>>>>> Elasticsearch Head plugin, my index goes offline and becomes
>>>>>>>> unrecoverable.
>>>>>>>>
>>>>>>>> For now, I have it running on a VM on my personal machine.
>>>>>>>>
>>>>>>>> VM Config:
>>>>>>>> Ubuntu Server 14.04 64-Bit
>>>>>>>> 8 GB RAM
>>>>>>>> 2 Processors
>>>>>>>> 32 GB SSD
>>>>>>>>
>>>>>>>> Java
>>>>>>>> java version "1.7.0_65"
>>>>>>>> OpenJDK Runtime Environment (IcedTea 2.5.1) (7u65-2.5.1-4ubuntu1~0.14.04.2)
>>>>>>>> OpenJDK 64-Bit Server VM (build 24.65-b04, mixed mode)
>>>>>>>>
>>>>>>>> Elasticsearch is using mostly the defaults. This is the output of:
>>>>>>>> curl http://localhost:9200/_nodes/process?pretty
>>>>>>>> {
>>>>>>>>   "cluster_name" : "property_transaction_data",
>>>>>>>>   "nodes" : {
>>>>>>>>     "KlFkO_qgSOKmV_jjj5xeVw" : {
>>>>>>>>       "name" : "Marvin Flumm",
>>>>>>>>       "transport_address" : "inet[/192.168.133.131:9300]",
>>>>>>>>       "host" : "ubuntu-es",
>>>>>>>>       "ip" : "127.0.1.1",
>>>>>>>>       "version" : "1.3.2",
>>>>>>>>       "build" : "dee175d",
>>>>>>>>       "http_address" : "inet[/192.168.133.131:9200]",
>>>>>>>>       "process" : {
>>>>>>>>         "refresh_interval_in_millis" : 1000,
>>>>>>>>         "id" : 1092,
>>>>>>>>         "max_file_descriptors" : 65535,
>>>>>>>>         "mlockall" : true
>>>>>>>>       }
>>>>>>>>     }
>>>>>>>>   }
>>>>>>>> }
>>>>>>>>
>>>>>>>> I adjusted ES_HEAP_SIZE to 512mb.
>>>>>>>>
>>>>>>>> I'm using the following code to pull data from SQL Server and index
>>>>>>>> it.
>>>>>>>>
>>>>>>>> --
>>>>>>>> You received this message because you are subscribed to the Google
>>>>>>>> Groups "elasticsearch" group.
>>>>>>>> To unsubscribe from this group and stop receiving emails from it,
>>>>>>>> send an email to elasticsearc...@googlegroups.com.
>>>>>>>> To view this discussion on the web visit
>>>>>>>> https://groups.google.com/d/msgid/elasticsearch/f94f96d4-8c3f-462f-bdcf-df717cbc6269%40googlegroups.com.
>>>>>>>> For more options, visit https://groups.google.com/d/optout.