Hi, I work on a complex workflow using Spark (parsing, cleaning, machine learning, ...). At the end of the workflow I want to send aggregated results to Elasticsearch so that my portal can query the data. There will be two types of processing: streaming, and the possibility to relaunch the workflow on all available data.
Right now I use elasticsearch-hadoop, and in particular its Spark support, to send documents to Elasticsearch with the saveJsonToEs(myindex, mytype) method. The goal is to have one index per day, using the index template that we built. AFAIK, elasticsearch-hadoop cannot use a field of a document to route that document to the proper index. What is the proper way to implement this? Should I add a special step using Spark and the bulk API, so that each executor sends its documents to the proper index based on a field of each line? Or is there something that I missed in elasticsearch-hadoop?

Julien
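P.S. To make the "special step" idea more concrete, here is a minimal sketch of the per-document routing I have in mind. It is plain Python with no Spark or Elasticsearch calls. The field name "timestamp", the "results-" index prefix, and the helper names index_for and group_by_index are all assumptions made up for illustration, not part of elasticsearch-hadoop.

```python
import json
from collections import defaultdict
from datetime import datetime

# Hypothetical names, chosen only for this sketch.
INDEX_PREFIX = "results-"
DATE_FIELD = "timestamp"


def index_for(doc_json, prefix=INDEX_PREFIX, date_field=DATE_FIELD):
    """Derive the daily index name from the date field of one JSON document."""
    doc = json.loads(doc_json)
    # Assumes an ISO-8601 style value such as "2015-03-01T10:00:00";
    # only the leading YYYY-MM-DD part is used.
    day = datetime.strptime(doc[date_field][:10], "%Y-%m-%d")
    return prefix + day.strftime("%Y.%m.%d")


def group_by_index(docs):
    """Group JSON documents (strings) by the daily index they belong to."""
    groups = defaultdict(list)
    for doc_json in docs:
        groups[index_for(doc_json)].append(doc_json)
    return dict(groups)
```

The thought is that each executor would run something like group_by_index over its partition and then issue one bulk request per resulting index; the bulk call itself is left out here because I am not sure what the recommended way to do it is.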