Thanks for your reply, I'll have a look into this.

On Friday, 13 March 2015 17:41:49 UTC, Aaron Mefford wrote:
>
> Weird, that was the post I made yesterday morning; it only just now hit the 
> list after vanishing.
>
> On Thu, Mar 12, 2015 at 10:21 AM, <aa...@definemg.com> wrote:
>
>> I switched to using aliases about a year ago and I love it.  I am able to 
>> rebuild in the background and make a clean cutover once the process 
>> completes.
>>
>> Here are a couple of thoughts for your situation.  
>>
>> First, create a second index, workshop_index_v1_new, that has the same 
>> mapping as your original.  When you are ready to start building your final 
>> index, stop indexing to your original and start indexing into this new 
>> index.  Queries to both indexes can be accomplished using a new alias, or 
>> by modifying the requests to include both.  Now you can transfer the bulk 
>> of your data from workshop_index_v1 to workshop_index_v2 while 
>> workshop_index_v1_new continues to collect the new documents.  Once the 
>> initial scan and scroll completes, you can cut over to workshop_index_v2 
>> and run a scan and scroll against the v1_new index, which should be 
>> relatively small and allow you to quickly transfer those documents into 
>> your v2 schema.
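To check my understanding of the alias steps above, here is a rough sketch of the two request bodies I think would be sent to the _aliases endpoint. The helper names are made up for illustration, "workshop" is the alias from my original post, and the choice of which indexes to drop at cutover is an assumption on my part. Nothing here talks to a cluster; it only builds the JSON.

```python
import json


def read_alias_actions(alias, indexes):
    """Build a POST /_aliases body that points `alias` at every index in `indexes`."""
    return {"actions": [{"add": {"index": name, "alias": alias}} for name in indexes]}


def cutover_actions(alias, old_indexes, new_index):
    """Build a POST /_aliases body that removes the old indexes from `alias`
    and adds the new one, all in a single request."""
    actions = [{"remove": {"index": name, "alias": alias}} for name in old_indexes]
    actions.append({"add": {"index": new_index, "alias": alias}})
    return {"actions": actions}


if __name__ == "__main__":
    # Phase 1: queries through the alias see both the original and the "new" v1 index.
    phase_one = read_alias_actions(
        "workshop", ["workshop_index_v1", "workshop_index_v1_new"]
    )
    # Phase 2: once v2 is populated, switch the alias over in one request.
    phase_two = cutover_actions(
        "workshop", ["workshop_index_v1", "workshop_index_v1_new"], "workshop_index_v2"
    )
    print(json.dumps(phase_one, indent=2))
    print(json.dumps(phase_two, indent=2))
```

As far as I understand, an alias that points at more than one index cannot be used for indexing, so during phase 1 the writes would have to name workshop_index_v1_new directly.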
>>
>> The alternative is to run the scan and scroll twice against the v1 
>> index: once to build the v2 index, at which point you cut over to v2, and 
>> a second time to pick up any documents that were added after you started 
>> your initial scan and scroll.  This is a less than ideal scenario: it will 
>> take longer and, unless you add steps to check whether documents already 
>> exist, will result in an index with many deletes.  If you have a timestamp 
>> in your documents, you might be able to make this reasonable.  You will 
>> certainly want to optimize after you complete this process.
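If I went the two-pass route with a timestamp, I assume the second scan would use a range query along these lines. This is only a sketch: the field name "updated_at" and the start time are placeholders, since my documents do not currently have such a field.

```python
def second_pass_query(timestamp_field, started_at):
    """Build a search body matching documents whose timestamp is at or after
    the moment the first scan and scroll started."""
    return {"query": {"range": {timestamp_field: {"gte": started_at}}}}


if __name__ == "__main__":
    # Placeholder values: record the real start time before launching the first pass.
    print(second_pass_query("updated_at", "2015-03-11T16:47:59Z"))
```

Using "gte" rather than "gt" may re-copy a few boundary documents, which seems safer than missing them.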
>>
>> The only downside to writing to the new one is deciding which one to 
>> query during the transition.  If you write to the v2 index, queries to v1 
>> will not show new data, while queries to v2 will show only new data until 
>> the migration progresses.  Queries that span both may be complicated 
>> because the mappings are different; if that is not the case, then yes, 
>> this is the easy way.  If you are OK with one of the caveats, then by all 
>> means this is the simplest route.
>>
>> Aaron
>>
>> On Wednesday, March 11, 2015 at 10:47:59 AM UTC-6, mzrth_7810 wrote:
>>>
>>> Hey everyone,
>>>
>>> I have a question about rebuilding an index. After reading the 
>>> Elasticsearch guide and various topics here, I've found that the best 
>>> practice for rebuilding an index without any downtime is to use aliases. 
>>> However, there are certain steps and processes around that on which I 
>>> seek advice. First I'm going to take you through an example scenario, and 
>>> then I'll have some questions.
>>>
>>> For example, you have "workshop_index_v1", with an alias "workshop". The 
>>> "workshop_index_v1" has a type called "guitar" which has three properties 
>>> with the following mapping:
>>>
>>> "identifier" : "string"
>>> "make" : "string"
>>> "model" : "string"
>>>
>>> Let's assume there is a lot of data in workshop_index_v1/guitar at the 
>>> moment, which has been populated from a separate database.
>>>
>>> Now I need to modify the mapping because I've changed the source data. 
>>> I would like to get rid of the "identifier" property, so my mapping becomes:
>>>
>>> "make" : "string"
>>> "model" : "string"
>>>
>>> As we all know, Elasticsearch does not allow you to remove a property 
>>> from the mapping directly; you inevitably have to rebuild the index, 
>>> which is fine in my case.
>>>
>>> So now a few things came to mind when I thought how to do this:
>>>
>>>    - Create another index "workshop_index_v2", populate it with the 
>>>    data in "workshop_index_v1" using scan and scroll with the bulk API, 
>>>    and later remove "workshop_index_v1" and add "workshop_index_v2" to 
>>>    the alias.
>>>       - This will not work because the incorrect mapping (or a field 
>>>       value in the incorrect mapping) is already present in 
>>>       "workshop_index_v1", and I do not want to copy everything as is.
>>>    - Create another index "workshop_index_v2", populate it with the 
>>>    data from the original source.
>>>       - This works.
>>>    
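Thinking about the first option again: if the only difference is the dropped "identifier" field, it might be possible to strip it while copying rather than copying everything as is. Below is a minimal sketch of that idea. The helper names are hypothetical, the hits are assumed to have the usual "_id", "_type" and "_source" keys returned by a scan and scroll, and the action line follows the bulk API format as I understand it (later Elasticsearch versions dropped "_type", so adjust as needed).

```python
import json


def strip_fields(source, dropped=("identifier",)):
    """Return a copy of a document's _source without the dropped fields."""
    return {key: value for key, value in source.items() if key not in dropped}


def bulk_body(hits, target_index):
    """Turn scan/scroll hits into a newline-delimited bulk request body
    that indexes each cleaned document into `target_index`."""
    lines = []
    for hit in hits:
        action = {
            "index": {
                "_index": target_index,
                "_type": hit["_type"],
                "_id": hit["_id"],  # keep the id so a re-run overwrites instead of duplicating
            }
        }
        lines.append(json.dumps(action))
        lines.append(json.dumps(strip_fields(hit["_source"])))
    # The bulk API expects every line, including the last, to end with a newline.
    return "".join(line + "\n" for line in lines)


if __name__ == "__main__":
    # Made-up sample hit, shaped like the guitar documents in my example.
    sample_hits = [
        {
            "_id": "1",
            "_type": "guitar",
            "_source": {"identifier": "abc", "make": "Fender", "model": "Stratocaster"},
        }
    ]
    print(bulk_body(sample_hits, "workshop_index_v2"), end="")
```

This only helps if the remaining field values in v1 are still correct; if the source data itself changed, repopulating from the original source is still the safer option.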
>>> One of the big issues here is, what happens to write requests while the 
>>> new index is being rebuilt.
>>>
>>> As you can only write to one index, which one do you write to, the old 
>>> one or the new one, or both?
>>>
>>> I feel that writing to the new one would work. I am a beginner when it 
>>> comes to Elasticsearch; any advice regarding any of this would be greatly 
>>> appreciated.
>>>
>>> Best regards
>>>
>>  -- 
>> You received this message because you are subscribed to a topic in the 
>> Google Groups "elasticsearch" group.
>> To unsubscribe from this topic, visit 
>> https://groups.google.com/d/topic/elasticsearch/U40jRfvA-ZM/unsubscribe.
>> To unsubscribe from this group and all its topics, send an email to 
>> elasticsearc...@googlegroups.com.
>> To view this discussion on the web visit 
>> https://groups.google.com/d/msgid/elasticsearch/c1a1f011-4d4f-4dba-b7f5-6899d4fe671e%40googlegroups.com.
>> For more options, visit https://groups.google.com/d/optout.
>>
>
>
