[ 
https://issues.apache.org/jira/browse/SOLR-11127?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16755007#comment-16755007
 ] 

Andrzej Bialecki  edited comment on SOLR-11127 at 1/29/19 1:24 PM:
-------------------------------------------------------------------

My plan of attack is to implement a collection command that orchestrates the 
following steps:
 * create a temporary collection with a unique name, e.g. {{tmpCollection_123}}, 
using the updated {{.system}} schema
 * define an alias that points {{.system -> tmpCollection_123}}. This should 
redirect all updates and queries to the temp collection.
 * copy the documents from {{.system}} to the temp collection, avoiding 
overwriting updated docs (incremental updates won't work during this process, 
but AFAIK no Solr component uses incremental updates when indexing to 
{{.system}})
 * delete the original {{.system}} and create it again using the updated schema.
 * remove the alias
 * copy over the documents from the temporary collection to {{.system}}, again 
avoiding overwrites.
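
As a rough illustration of the steps above, here is a minimal Python sketch that only builds the ordered list of Collections API requests and does not send anything. The node URL, {{numShards=1}}, and the configset name {{system_points}} are assumptions made for the example, not part of the proposal. The two copy steps have no ready-made API call here, so they appear as placeholders. The {{without_overwrite}} helper shows one possible way to avoid overwrites, using Solr's optimistic concurrency (a negative {{_version_}} means the document must not exist yet). How Solr resolves a name shared by an alias and a collection is not addressed by this sketch.

```python
from urllib.parse import urlencode

# Assumed address of a local Solr node; adjust for a real cluster.
ADMIN = "http://localhost:8983/solr/admin/collections"


def collections_url(action, params):
    """Build a Collections API URL for the given action and parameters."""
    query = {"action": action}
    query.update(params)
    return ADMIN + "?" + urlencode(query)


def without_overwrite(doc):
    """Return a copy of doc that Solr should reject if its id already exists.

    Relies on optimistic concurrency: a negative _version_ means the
    document must not exist yet.
    """
    copy = dict(doc)
    copy["_version_"] = -1
    return copy


def migration_plan(tmp="tmpCollection_123", config="system_points"):
    """Return (description, url) pairs; url is None for the copy steps.

    The configset name is hypothetical and stands for the updated
    Points-based .system schema.
    """
    create = {"numShards": 1, "collection.configName": config}
    return [
        ("create temp collection",
         collections_url("CREATE", dict(name=tmp, **create))),
        ("alias .system -> temp",
         collections_url("CREATEALIAS", {"name": ".system", "collections": tmp})),
        ("copy .system -> temp (no overwrites)", None),
        ("delete original .system",
         collections_url("DELETE", {"name": ".system"})),
        ("recreate .system with updated schema",
         collections_url("CREATE", dict(name=".system", **create))),
        ("remove alias",
         collections_url("DELETEALIAS", {"name": ".system"})),
        ("copy temp -> .system (no overwrites)", None),
    ]


if __name__ == "__main__":
    for description, url in migration_plan():
        print(description, "->", url or "(document copy, not an API call)")
```

Keeping the plan as data makes the ordering easy to review before anything is executed against a cluster.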

The collection command will take care of async processing, resuming the 
operation on Overseer restarts, etc.
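
For context, existing Collections API commands support asynchronous execution through the {{async}} parameter and are polled with {{REQUESTSTATUS}}. The sketch below assumes the new command would follow the same pattern. {{MIGRATESYSTEMCOLLECTION}} is only the name proposed in the issue description, the node URL is an assumption, and the response shape checked in {{is_finished}} is my understanding of the {{REQUESTSTATUS}} output rather than something this issue defines.

```python
from urllib.parse import urlencode

# Assumed address of a local Solr node.
ADMIN = "http://localhost:8983/solr/admin/collections"


def submit_url(request_id):
    """URL that would submit the proposed command asynchronously."""
    return ADMIN + "?" + urlencode(
        {"action": "MIGRATESYSTEMCOLLECTION", "async": request_id})


def status_url(request_id):
    """URL that polls the state of a previously submitted async request."""
    return ADMIN + "?" + urlencode(
        {"action": "REQUESTSTATUS", "requestid": request_id})


def is_finished(status_response):
    """True once the parsed JSON response reports a terminal state."""
    state = status_response.get("status", {}).get("state")
    return state in ("completed", "failed")
```

A client would call {{submit_url}} once, then poll {{status_url}} until {{is_finished}} returns true.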

I considered doing this as a sort of rolling in-place update, but it wouldn't 
be any less expensive, and I think it would be impossible to get right: the 
updated schema uses points instead of trie fields for the same fields.

Comments and feedback are welcome (thanks [~janhoy] for useful suggestions).

(Also, given that the 8.0 release is imminent, I'm not sure I can fix this in 
time.)



> Add a Collections API command to migrate the .system collection schema from 
> Trie-based (pre-7.0) to Points-based (7.0+)
> -----------------------------------------------------------------------------------------------------------------------
>
>                 Key: SOLR-11127
>                 URL: https://issues.apache.org/jira/browse/SOLR-11127
>             Project: Solr
>          Issue Type: Task
>      Security Level: Public(Default Security Level. Issues are Public) 
>            Reporter: Steve Rowe
>            Assignee: Andrzej Bialecki 
>            Priority: Blocker
>              Labels: numeric-tries-to-points
>             Fix For: 8.0
>
>
> SOLR-11119 will switch the Trie fieldtypes in the .system collection's schema 
> to Points.
> Users with pre-7.0 .system collections will no longer be able to use them 
> once Trie fields have been removed (8.0).
> Solr should provide a Collections API command MIGRATESYSTEMCOLLECTION to 
> automatically convert a Trie-based .system collection to a Points-based one.


