Check the path.data setting in config/elasticsearch.yml.

Jörg
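Since the error quoted below is "No space left on device", it can help to confirm how much room is left on the partition behind path.data before re-running the indexer. Here is a minimal sketch, assuming Python 3 is available on the machine. The helper names are made up for illustration, and the path you pass in is whatever path.data resolves to on your install. The commented-out package default quoted below is /var/lib/elasticsearch, but verify this rather than assume it.

```python
import shutil


def free_gib(path):
    """Return the free space, in GiB, on the partition that holds `path`."""
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (bytes)
    return usage.free / 2**30


def has_room(path, needed_gib):
    """Return True if the partition holding `path` has at least `needed_gib` GiB free."""
    return free_gib(path) >= needed_gib
```

For example, `has_room("/var/lib/elasticsearch", 10)` reports whether that partition still has 10 GiB free. If it does not, pointing path.data at a directory on a larger partition and restarting the node is the usual way out. Any existing index data would have to be moved or reindexed.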
On Tue, Sep 9, 2014 at 7:50 PM, Joshua P <jpetersen...@gmail.com> wrote:

> Just reran the indexer and found this error coming up. I'm running out of
> disk space on the partition ES wants to write to.
>
> F38KqHhnRDWtiJCss5Wz0g -- INTERNAL_SERVER_ERROR --
> TranslogException[[index_type][0] Failed to write operation
> [org.elasticsearch.index.translog.Translog$Create@6f1f6b1e]]; nested:
> IOException[No space left on device]; -- index_type
>
> Where would I change the write location? Which config file?
>
> On Tuesday, September 9, 2014 1:28:21 PM UTC-4, Joshua P wrote:
>>
>> Hi Jörg,
>>
>> Can you elaborate on what you mean by "more fine tuning"?
>>
>> I've upped the heap size to 4g (in both places I mentioned before, because
>> it's not clear to me which one ES actually uses). I haven't tried to index
>> again yet.
>> Other than throttling my indexing, what other things do I need to be
>> thinking about?
>>
>> On Tuesday, September 9, 2014 12:53:35 PM UTC-4, Jörg Prante wrote:
>>>
>>> Set ES_HEAP_SIZE to at least 1 GB. For smaller heaps like 512m and
>>> indexing around 1 million docs, you need some more fine tuning, which is
>>> complicated. Your machine is fine with a 4 GB heap, which is 50% of its
>>> 8 GB RAM.
>>>
>>> Jörg
>>>
>>> On Tue, Sep 9, 2014 at 5:39 PM, Joshua P <jpeter...@gmail.com> wrote:
>>>
>>>> Here is /etc/default/elasticsearch
>>>>
>>>> # Run Elasticsearch as this user ID and group ID
>>>> #ES_USER=elasticsearch
>>>> #ES_GROUP=elasticsearch
>>>>
>>>> # Heap Size (defaults to 256m min, 1g max)
>>>> ES_HEAP_SIZE=512m
>>>>
>>>> # Heap new generation
>>>> #ES_HEAP_NEWSIZE=
>>>>
>>>> # max direct memory
>>>> #ES_DIRECT_SIZE=
>>>>
>>>> # Maximum number of open files, defaults to 65535.
>>>> MAX_OPEN_FILES=65535
>>>>
>>>> # Maximum locked memory size. Set to "unlimited" if you use the
>>>> # bootstrap.mlockall option in elasticsearch.yml. You must also set
>>>> # ES_HEAP_SIZE.
>>>> MAX_LOCKED_MEMORY=unlimited
>>>>
>>>> # Maximum number of VMA (Virtual Memory Areas) a process can own
>>>> #MAX_MAP_COUNT=262144
>>>>
>>>> # Elasticsearch log directory
>>>> #LOG_DIR=/var/log/elasticsearch
>>>>
>>>> # Elasticsearch data directory
>>>> #DATA_DIR=/var/lib/elasticsearch
>>>>
>>>> # Elasticsearch work directory
>>>> #WORK_DIR=/tmp/elasticsearch
>>>>
>>>> # Elasticsearch configuration directory
>>>> #CONF_DIR=/etc/elasticsearch
>>>>
>>>> # Elasticsearch configuration file (elasticsearch.yml)
>>>> #CONF_FILE=/etc/elasticsearch/elasticsearch.yml
>>>>
>>>> # Additional Java OPTS
>>>> #ES_JAVA_OPTS=
>>>>
>>>> # Configure restart on package upgrade (true, every other setting will lead to not restarting)
>>>> #RESTART_ON_UPGRADE=true
>>>>
>>>> I also see the same setting in /etc/init.d/elasticsearch. Do you know
>>>> which file takes priority? And what would a good size be?
>>>>
>>>> On Tuesday, September 9, 2014 11:32:19 AM UTC-4, vineeth mohan wrote:
>>>>>
>>>>> Hello Joshua,
>>>>>
>>>>> I am not sure which variable you are referring to in the memory
>>>>> settings in the config file. Please paste the comment and config.
>>>>> I usually change the config from the init.d script.
>>>>>
>>>>> The best approach would be to bulk index, say, 10,000 feeds in sync mode,
>>>>> wait until everything is indexed, and then proceed to the next batch.
>>>>> I am not sure about the Java API, but a long time ago I used curl against
>>>>> this stats API to see how many requests were rejected.
>>>>>
>>>>> Thanks
>>>>> Vineeth
>>>>>
>>>>> On Tue, Sep 9, 2014 at 8:58 PM, Joshua P <jpeter...@gmail.com> wrote:
>>>>>
>>>>>> You also said you wouldn't recommend indexing that much information
>>>>>> at once. How would you suggest breaking it up, and what status should I
>>>>>> look for before doing another batch? I have to come up with some process
>>>>>> that is repeatable and mostly automated.
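The batch-and-wait approach suggested above (send about 10,000 documents, wait for the response, then send the next batch) can be sketched roughly as follows. This is only an illustration in Python, not the Elasticsearch client API. `send_bulk` is a hypothetical callable you would supply yourself. It is assumed to block until Elasticsearch has answered the bulk request and to return the number of items that failed or were rejected.

```python
from itertools import islice


def batches(records, size=10_000):
    """Yield successive lists of at most `size` records from any iterable."""
    it = iter(records)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


def index_all(records, send_bulk, size=10_000):
    """Index `records` one batch at a time and return how many were sent.

    `send_bulk` is a hypothetical, caller-supplied function: it must block
    until the bulk request completes and return the count of failed items.
    The run stops at the first batch that reports failures, so it can be
    inspected and resumed instead of piling more work onto the queue.
    """
    sent = 0
    for chunk in batches(records, size):
        failed = send_bulk(chunk)
        if failed:
            raise RuntimeError(
                f"{failed} items failed after {sent} records were indexed"
            )
        sent += len(chunk)
    return sent
```

Because each batch is sent only after the previous one has returned, at most one bulk request is in flight at a time. That makes the process repeatable and easy to automate, and the failure count per batch is the status to check before continuing.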
>>>>>>
>>>>>> On Tuesday, September 9, 2014 11:12:59 AM UTC-4, Joshua P wrote:
>>>>>>>
>>>>>>> Thanks for the reply, Vineeth!
>>>>>>>
>>>>>>> What's a practical heap size? I've seen some people saying they set
>>>>>>> it to 30gb, but this confuses me because in the
>>>>>>> /etc/default/elasticsearch file, the comment suggests the max is only 1gb?
>>>>>>>
>>>>>>> I'll look into the threadpool issue. Is there a Java API for
>>>>>>> monitoring cluster node health? Can you point me at an example or give
>>>>>>> me a link to that?
>>>>>>>
>>>>>>> Thanks!
>>>>>>>
>>>>>>> On Tuesday, September 9, 2014 10:52:35 AM UTC-4, vineeth mohan wrote:
>>>>>>>>
>>>>>>>> Hello Joshua,
>>>>>>>>
>>>>>>>> I have a feeling this has something to do with the threadpool.
>>>>>>>> There is a limit on the number of feeds that can be queued for indexing.
>>>>>>>>
>>>>>>>> Try increasing the queue size of the index and bulk threadpools to a
>>>>>>>> large number.
>>>>>>>> Also, through the cluster nodes stats API for the threadpool, you can see
>>>>>>>> whether any request has failed.
>>>>>>>> Monitor this API for requests that failed due to the large volume.
>>>>>>>>
>>>>>>>> Threadpool - http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/modules-threadpool.html
>>>>>>>> Threadpool stats - http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/cluster-nodes-stats.html
>>>>>>>>
>>>>>>>> Having said that, I wouldn't recommend bulk indexing that much
>>>>>>>> information at a time, and 512 MB is not going to help much.
>>>>>>>>
>>>>>>>> Thanks
>>>>>>>> Vineeth
>>>>>>>>
>>>>>>>> On Tue, Sep 9, 2014 at 7:48 PM, Joshua P <jpeter...@gmail.com> wrote:
>>>>>>>>
>>>>>>>>> Hi there!
>>>>>>>>>
>>>>>>>>> I'm trying to do a one-time index of about 800,000 records into an
>>>>>>>>> instance of Elasticsearch, but I'm having a bit of trouble. It
>>>>>>>>> continually fails around 200,000 records.
>>>>>>>>> Looking at it in the Elasticsearch Head plugin, my index goes offline
>>>>>>>>> and becomes unrecoverable.
>>>>>>>>>
>>>>>>>>> For now, I have it running on a VM on my personal machine.
>>>>>>>>>
>>>>>>>>> VM Config:
>>>>>>>>> Ubuntu Server 14.04 64-Bit
>>>>>>>>> 8 GB RAM
>>>>>>>>> 2 Processors
>>>>>>>>> 32 GB SSD
>>>>>>>>>
>>>>>>>>> Java
>>>>>>>>> java version "1.7.0_65"
>>>>>>>>> OpenJDK Runtime Environment (IcedTea 2.5.1) (7u65-2.5.1-4ubuntu1~0.14.04.2)
>>>>>>>>> OpenJDK 64-Bit Server VM (build 24.65-b04, mixed mode)
>>>>>>>>>
>>>>>>>>> Elasticsearch is using mostly the defaults. This is the output of:
>>>>>>>>> curl http://localhost:9200/_nodes/process?pretty
>>>>>>>>> {
>>>>>>>>>   "cluster_name" : "property_transaction_data",
>>>>>>>>>   "nodes" : {
>>>>>>>>>     "KlFkO_qgSOKmV_jjj5xeVw" : {
>>>>>>>>>       "name" : "Marvin Flumm",
>>>>>>>>>       "transport_address" : "inet[/192.168.133.131:9300]",
>>>>>>>>>       "host" : "ubuntu-es",
>>>>>>>>>       "ip" : "127.0.1.1",
>>>>>>>>>       "version" : "1.3.2",
>>>>>>>>>       "build" : "dee175d",
>>>>>>>>>       "http_address" : "inet[/192.168.133.131:9200]",
>>>>>>>>>       "process" : {
>>>>>>>>>         "refresh_interval_in_millis" : 1000,
>>>>>>>>>         "id" : 1092,
>>>>>>>>>         "max_file_descriptors" : 65535,
>>>>>>>>>         "mlockall" : true
>>>>>>>>>       }
>>>>>>>>>     }
>>>>>>>>>   }
>>>>>>>>> }
>>>>>>>>>
>>>>>>>>> I adjusted ES_HEAP_SIZE to 512mb.
>>>>>>>>>
>>>>>>>>> I'm using the following code to pull data from SQL Server and
>>>>>>>>> index it.
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>> You received this message because you are subscribed to the Google
>>>>>>>>> Groups "elasticsearch" group.
>>>>>>>>> To unsubscribe from this group and stop receiving emails from it,
>>>>>>>>> send an email to elasticsearc...@googlegroups.com.
>>>>>>>>> To view this discussion on the web visit
>>>>>>>>> https://groups.google.com/d/msgid/elasticsearch/f94f96d4-8c3f-462f-bdcf-df717cbc6269%40googlegroups.com.
>>>>>>>>> For more options, visit https://groups.google.com/d/optout.