There is currently no way to redirect indexing to a preparer index for delayed indexing while searching stays enabled.
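
Since there is no built-in switch for this, delayed indexing has to be arranged outside ES. As a rough sketch only, and not an ES feature: a small buffer in front of the indexer could accept documents at all times and forward them only while indexing is enabled. The class name `DelayedIndexer` and the `sink` callback are hypothetical; `sink` stands in for whatever really indexes a batch (for example a bulk request), and the in-memory deque would need to be a durable store in any real setup.

```python
from collections import deque
from typing import Callable, Deque, Dict, List

Doc = Dict[str, object]


class DelayedIndexer:
    """Accept documents at any time; forward them only while not suspended.

    Hypothetical illustration, not part of any Elasticsearch API.
    """

    def __init__(self, sink: Callable[[List[Doc]], None]) -> None:
        self._sink = sink                    # assumed: indexes one batch of documents
        self._pending: Deque[Doc] = deque()  # in-memory only, NOT durable
        self._suspended = False

    def suspend(self) -> None:
        """Stop forwarding; submitted documents are only buffered."""
        self._suspended = True

    def resume(self) -> None:
        """Forward everything buffered so far, then continue normally."""
        self._suspended = False
        self._flush()

    def submit(self, doc: Doc) -> None:
        """Always accept the document; forward it unless suspended."""
        self._pending.append(doc)
        if not self._suspended:
            self._flush()

    def pending(self) -> int:
        """Number of documents accepted but not yet forwarded."""
        return len(self._pending)

    def _flush(self) -> None:
        if self._pending:
            batch = list(self._pending)
            self._sink(batch)        # if the sink raises, the buffer is kept
            self._pending.clear()


# Usage: a plain list plays the role of the search index here.
indexed: List[Doc] = []
indexer = DelayedIndexer(indexed.extend)
indexer.submit({"id": 1})   # forwarded immediately
indexer.suspend()
indexer.submit({"id": 2})   # buffered while suspended
print(indexer.pending(), len(indexed))
indexer.resume()            # buffered document is forwarded now
print(indexer.pending(), len(indexed))
```

Searching is not touched by any of this; only the write path is held back, which is the behaviour asked for in this thread.
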

With rivers, you can close the _river index; some rivers (not all) may take this as a signal to stop indexing until the _river index is reopened. I consider this a workaround, not a feature.

From my understanding, the preferred way to implement delayed indexing at the moment is to set up a durable message queue (such as RabbitMQ and logstash) for external document persistence. By stopping, starting, and reconfiguring the message queue, the data can be indexed wherever you like.

If you would like to see delayed indexing as a core feature in ES rather than as a plugin, you should open an issue with the suggestion. To be honest, I assume it will be rejected in favor of a queue in front of ES, as described in this blog post:

http://dopey.io/logstash-rabbitmq-tuning.html

Jörg

On Tue, Nov 11, 2014 at 11:40 PM, Amish Asthana <asthanaam...@gmail.com> wrote:

> Thanks Jorg, make sense.
> Few minor questions :
> a) With the current ES architecture is this the best/recommended way?
> b) Is there any project in roadmap to provide more support for it.
>
> regards and thanks
> amish
>
> On Tuesday, November 11, 2014 12:08:24 PM UTC-8, Jörg Prante wrote:
>>
>> FAST stored the source data in distributed machines, only the control API
>> was not distributed (similar to ES HTTP curl requests, which also connect
>> to one host only).
>>
>> Of course you could index raw JSON to a preparer index with a single
>> field, _all disabled, and field set to "not indexed" so there is no Lucene
>> activity on it. This preparer index could also hold mappings in special
>> documents for the indexing runs.
>>
>> The data duplication factor depends on the complexity of the mapping(s),
>> and the characteristics of the data (dictionary size, analyzer / tokenizer
>> output, norms etc.)
>>
>> A plugin would do no magic at all, it could bundle the calls that
>> otherwise a client would have to execute from remote, and adds some
>> convenience commands for managing the prepare stage (e.g. suspend/resume)
>> and showing the current state of indexing.
>>
>> If redundant data is a no-go, then the whole approach is counterintuitive.
>>
>> Jörg
>>
>>
>> On Tue, Nov 11, 2014 at 7:46 PM, Amish Asthana <asthan...@gmail.com>
>> wrote:
>>
>>> With existing Elastic Search I can think of an architecture like this.
>>>
>>> Index : indexForDataDump : No mapping(Is it possible?) or minimum
>>> mapping. Use only to dump data from external system. There is some primary
>>> key.
>>>
>>> There are different search indexes with different mapping :
>>> search-index1, search-index2 etc.
>>> These indexes get populated from the indexForDataDump using technique
>>> mentioned here
>>> <http://www.elasticsearch.org/blog/changing-mapping-with-zero-downtime/>
>>> .
>>> So this way I can drop the search index as desired and create new one
>>> with new mapping.
>>> Any pros/cons or issue with this approach? There will be data
>>> duplication but I am hoping its minimum. ( Any way to quantify it?)
>>>
>>> regards and thanks
>>> amish
>>>
>>>
>>> On Tuesday, November 11, 2014 10:02:46 AM UTC-8, Amish Asthana wrote:
>>>>
>>>> I am not aware of FAST but the idea looks promising.
>>>> However it might not be that easy to just have plugin for ES, as the
>>>> data itself is distributed on different machines.
>>>> So it will not be possible to have just one server with the data, as it
>>>> will become single point of failure.
>>>> regards and thanks
>>>> amish
>>>>
>>>> On Tuesday, November 11, 2014 1:21:53 AM UTC-8, Jörg Prante wrote:
>>>>>
>>>>> I know from the FAST Search engine ten years ago there was a two-phase
>>>>> commit for distributed search and indexing. One server could listen on the
>>>>> API and keep the (compressed) input stored, and all the other indexing
>>>>> servers were supplied by this input in another phase to create binary
>>>>> indexes, either automatically, or by manual operation, called
>>>>> "suspend/resume indexing API".
>>>>>
>>>>> The advantage was that data could be received permanently via API
>>>>> while FAST indexing could be stopped temporarily in order to balance
>>>>> between indexing and search performance on limited hardware.
>>>>>
>>>>> Do you think of something like that also for Elasticsearch? This
>>>>> architecture is possible to implement by a plugin.
>>>>>
>>>>> Jörg
>>>>>
>>>>> On Mon, Nov 10, 2014 at 10:13 PM, Amish Asthana <asthan...@gmail.com>
>>>>> wrote:
>>>>>
>>>>>> Hi
>>>>>> Is there a way we can decouple data and associated mapping/indexing
>>>>>> in Elasticsearch itself.
>>>>>> Basically store the raw data as source( json or some other format)
>>>>>> and various mapping/index can be used on top of that.
>>>>>> I understand that one can use an outside database or file system, but
>>>>>> can it be natively achieved in ES itself.
>>>>>>
>>>>>> Basically we are trying to see how our ES instance will work when we
>>>>>> have to change mapping of existing and continuously incoming data without
>>>>>> any downtime for the end user.
>>>>>> We have an added wrinkle that our indexing has to be edit aware for
>>>>>> versioning purpose; unlike ES where each edit is a new record.
>>>>>> regards and thanks
>>>>>> amish
>>>>>>
>>>>>> --
>>>>>> You received this message because you are subscribed to the Google
>>>>>> Groups "elasticsearch" group.
>>>>>> To unsubscribe from this group and stop receiving emails from it,
>>>>>> send an email to elasticsearc...@googlegroups.com.
>>>>>> To view this discussion on the web visit
>>>>>> https://groups.google.com/d/msgid/elasticsearch/0bb1f5ef-3991-4568-9891-018baf79ebae%40googlegroups.com
>>>>>> For more options, visit https://groups.google.com/d/optout.

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/CAKdsXoGrxq0S5HcY8bwohqexPWqCTwR2DR521UUs_K-WsNqWiQ%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.