[jira] [Commented] (SOLR-11127) Add a Collections API command to migrate the .system collection schema from Trie-based (pre-7.0) to Points-based (7.0+)

2019-05-16 Thread David Smiley (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-11127?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16841666#comment-16841666
 ] 

David Smiley commented on SOLR-11127:
-

You've been working on some cool and useful issues [~ab] – kudos!

I want to mention a couple of suggestions to improve this feature in the future:
 * Making the source collection read-only might be inconvenient or infeasible 
for some apps.  As an option, a best-effort attempt would be useful.  Even if 
some changes don't make it, the client may already have a means of detecting 
data that needs to be resent, such as using a strategy involving looking at the 
highest timestamp. Or it may simply not matter, like for an experiment on the 
target.
 * IMO {{batchSize}} would have been a more appropriate name for the {{rows}} 
param, which as it stands appears to limit the reindexing to just this number 
of documents. After all, it is the same param name that we are all intimately 
familiar with from /select. I see this use of "rows" was in turn borrowed from 
topic(), but that's the same issue there. Ah well; many users won't touch this 
anyway.
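
A minimal sketch of the highest-timestamp strategy from the first bullet, to make the best-effort idea concrete. It assumes every document carries a client-maintained {{timestamp}} field (a hypothetical name, not something Solr defines) and that anything missed by the copy is newer than everything already in the target. Plain Python dicts stand in for indexed documents.

```python
def docs_to_resend(source_docs, target_docs, ts_field="timestamp"):
    """Return source documents newer than the newest document in the target.

    Both arguments are lists of dicts standing in for indexed documents;
    `ts_field` is a hypothetical field name chosen for this sketch.
    """
    if not target_docs:
        # Nothing arrived in the target, so everything has to be resent.
        return list(source_docs)
    high_water = max(doc[ts_field] for doc in target_docs)
    return [doc for doc in source_docs if doc[ts_field] > high_water]


source = [
    {"id": "a", "timestamp": 100},
    {"id": "b", "timestamp": 200},
    {"id": "c", "timestamp": 300},  # written while the copy was running
]
target = [
    {"id": "a", "timestamp": 100},
    {"id": "b", "timestamp": 200},
]
missed = docs_to_resend(source, target)
```

Here {{missed}} holds only document "c", which the client would resend to the target.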

Also I suggest re-titling this issue to reflect your commit message – 
"REINDEXCOLLECTION command for re-indexing of existing collections.", not the 
original goal.

> Add a Collections API command to migrate the .system collection schema from 
> Trie-based (pre-7.0) to Points-based (7.0+)
> ---
>
> Key: SOLR-11127
> URL: https://issues.apache.org/jira/browse/SOLR-11127
> Project: Solr
>  Issue Type: Task
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Steve Rowe
>Assignee: Andrzej Bialecki 
>Priority: Blocker
>  Labels: numeric-tries-to-points
> Fix For: 8.1, master (9.0)
>
> Attachments: SOLR-11127.patch, SOLR-11127.patch, SOLR-11127.patch, 
> SOLR-11127.patch
>
>
> SOLR-9 will switch the Trie fieldtypes in the .system collection's schema 
> to Points.
> Users with pre-7.0 .system collections will no longer be able to use them 
> once Trie fields have been removed (8.0).
> Solr should provide a Collections API command MIGRATESYSTEMCOLLECTION to 
> automatically convert a Trie-based .system collection to a Points-based one.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-11127) Add a Collections API command to migrate the .system collection schema from Trie-based (pre-7.0) to Points-based (7.0+)

2019-03-19 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-11127?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16796080#comment-16796080
 ] 

ASF subversion and git services commented on SOLR-11127:


Commit b778417054e735cf323139a43e84d6262ce9dcd7 in lucene-solr's branch 
refs/heads/branch_8x from Andrzej Bialecki
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=b778417 ]

SOLR-11127: REINDEXCOLLECTION command for re-indexing of existing collections.





[jira] [Commented] (SOLR-11127) Add a Collections API command to migrate the .system collection schema from Trie-based (pre-7.0) to Points-based (7.0+)

2019-03-19 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-11127?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16796048#comment-16796048
 ] 

ASF subversion and git services commented on SOLR-11127:


Commit 6f2b7bf5c0144f19572b54eed4fc340c13cf8c2a in lucene-solr's branch 
refs/heads/master from Andrzej Bialecki
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=6f2b7bf ]

SOLR-11127: REINDEXCOLLECTION command for re-indexing of existing collections.





[jira] [Commented] (SOLR-11127) Add a Collections API command to migrate the .system collection schema from Trie-based (pre-7.0) to Points-based (7.0+)

2019-03-18 Thread Andrzej Bialecki (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-11127?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16795401#comment-16795401
 ] 

Andrzej Bialecki  commented on SOLR-11127:
--

The latest patch. This includes a {{.system}} compatibility check that is 
performed on {{Overseer}} leader startup. This verification only logs a warning 
about the potentially incompatible index data, providing details of schema 
fields that are likely incompatible. This should provide sufficient information 
for users to decide whether to re-index the collection.
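
As an illustration only, a check of this kind could look roughly like the sketch below. It is not the code from the patch: it assumes the schema is available as a mapping from field name to field type class, and it treats any class whose simple name starts with "Trie" as suspect. The field names in the example are hypothetical.

```python
import logging

log = logging.getLogger("system_schema_check")


def find_trie_fields(field_types):
    """Return sorted (field, type class) pairs whose class looks Trie-based.

    `field_types` maps a field name to its field type class name.
    """
    return sorted(
        (name, cls)
        for name, cls in field_types.items()
        if cls.rsplit(".", 1)[-1].startswith("Trie")
    )


def warn_if_incompatible(field_types):
    """Log a single warning listing suspect fields and return them."""
    suspects = find_trie_fields(field_types)
    if suspects:
        details = ", ".join(f"{name} ({cls})" for name, cls in suspects)
        log.warning(
            "Possibly incompatible .system index data in fields: %s", details
        )
    return suspects


# Hypothetical schema used only to exercise the sketch.
example_schema = {
    "id": "solr.StrField",
    "size": "solr.TrieLongField",
    "timestamp": "solr.TrieDateField",
    "version": "solr.LongPointField",
}
suspects = warn_if_incompatible(example_schema)
```

The function only warns and returns the suspect fields; deciding whether to re-index is left to the user, as described above.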

If there are no objections I'd like to commit this shortly.




[jira] [Commented] (SOLR-11127) Add a Collections API command to migrate the .system collection schema from Trie-based (pre-7.0) to Points-based (7.0+)

2019-03-13 Thread Andrzej Bialecki (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-11127?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16792125#comment-16792125
 ] 

Andrzej Bialecki  commented on SOLR-11127:
--

Another update: support for checking the status and progress of reindexing, 
plus RefGuide documentation.

> Add a Collections API command to migrate the .system collection schema from 
> Trie-based (pre-7.0) to Points-based (7.0+)
> ---
>
> Key: SOLR-11127
> URL: https://issues.apache.org/jira/browse/SOLR-11127
> Project: Solr
>  Issue Type: Task
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Steve Rowe
>Assignee: Andrzej Bialecki 
>Priority: Blocker
>  Labels: numeric-tries-to-points
> Fix For: 8.0
>
> Attachments: SOLR-11127.patch, SOLR-11127.patch, SOLR-11127.patch
>
>
> SOLR-9 will switch the Trie fieldtypes in the .system collection's schema 
> to Points.
> Users with pre-7.0 .system collections will no longer be able to use them 
> once Trie fields have been removed (8.0).
> Solr should provide a Collections API command MIGRATESYSTEMCOLLECTION to 
> automatically convert a Trie-based .system collection to a Points-based one.






[jira] [Commented] (SOLR-11127) Add a Collections API command to migrate the .system collection schema from Trie-based (pre-7.0) to Points-based (7.0+)

2019-03-12 Thread Andrzej Bialecki (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-11127?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16790861#comment-16790861
 ] 

Andrzej Bialecki  commented on SOLR-11127:
--

Updated patch, with a lot more internal error checking and additional unit 
tests. I think this is fairly complete in functionality; more documentation 
will follow soon.


> Add a Collections API command to migrate the .system collection schema from 
> Trie-based (pre-7.0) to Points-based (7.0+)
> ---
>
> Key: SOLR-11127
> URL: https://issues.apache.org/jira/browse/SOLR-11127
> Project: Solr
>  Issue Type: Task
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Steve Rowe
>Assignee: Andrzej Bialecki 
>Priority: Blocker
>  Labels: numeric-tries-to-points
> Fix For: 8.0
>
> Attachments: SOLR-11127.patch, SOLR-11127.patch
>
>
> SOLR-9 will switch the Trie fieldtypes in the .system collection's schema 
> to Points.
> Users with pre-7.0 .system collections will no longer be able to use them 
> once Trie fields have been removed (8.0).
> Solr should provide a Collections API command MIGRATESYSTEMCOLLECTION to 
> automatically convert a Trie-based .system collection to a Points-based one.






[jira] [Commented] (SOLR-11127) Add a Collections API command to migrate the .system collection schema from Trie-based (pre-7.0) to Points-based (7.0+)

2019-02-28 Thread Andrzej Bialecki (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-11127?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16780903#comment-16780903
 ] 

Andrzej Bialecki  commented on SOLR-11127:
--

This patch implements a REINDEX_COLLECTION command (NOTE: it depends on the 
changes in SOLR-13271). It uses the procedure described above. A daemon 
streaming expression is used for copying documents between collections.

The new command supports reindexing any collection, with the usual caveats 
about potential data loss, and it supports the following:
* different or identical source and target collection names (by using aliases, 
as described above)
* most collection CREATE parameters, which allows re-shaping the collection 
(e.g. changing the number of shards, the router, etc.)
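
A hedged sketch of what a request for such a command might look like. The action name is taken from the later commit message (REINDEXCOLLECTION); the {{name}} and {{target}} parameter names are assumptions modelled on other Collections API commands, {{numShards}} stands for one of the CREATE parameters mentioned above, and the host and collection names are made up.

```python
from urllib.parse import urlencode


def reindex_url(base_url, source, **params):
    """Build a Collections API URL for the reindex command.

    `name` and any keyword parameters are assumptions for illustration;
    check the RefGuide for the parameters the command really accepts.
    """
    query = {"action": "REINDEXCOLLECTION", "name": source}
    query.update(params)
    return f"{base_url}/admin/collections?{urlencode(query)}"


url = reindex_url(
    "http://localhost:8983/solr",   # hypothetical host
    "products",                     # hypothetical source collection
    target="products_v2",
    numShards=4,
)
```

Omitting {{target}} would correspond to the "same source and target name" case, which relies on aliases.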

Comments and reviews are very much appreciated!

> Add a Collections API command to migrate the .system collection schema from 
> Trie-based (pre-7.0) to Points-based (7.0+)
> ---
>
> Key: SOLR-11127
> URL: https://issues.apache.org/jira/browse/SOLR-11127
> Project: Solr
>  Issue Type: Task
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Steve Rowe
>Assignee: Andrzej Bialecki 
>Priority: Blocker
>  Labels: numeric-tries-to-points
> Fix For: 8.0
>
> Attachments: SOLR-11127.patch
>
>
> SOLR-9 will switch the Trie fieldtypes in the .system collection's schema 
> to Points.
> Users with pre-7.0 .system collections will no longer be able to use them 
> once Trie fields have been removed (8.0).
> Solr should provide a Collections API command MIGRATESYSTEMCOLLECTION to 
> automatically convert a Trie-based .system collection to a Points-based one.






[jira] [Commented] (SOLR-11127) Add a Collections API command to migrate the .system collection schema from Trie-based (pre-7.0) to Points-based (7.0+)

2019-02-25 Thread Andrzej Bialecki (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-11127?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16777262#comment-16777262
 ] 

Andrzej Bialecki  commented on SOLR-11127:
--

Implementing a read-only mode for a collection allows us to use a better 
solution to this problem:
 * create a new unique collection using the new schema, eg. 
{{.reindex__}}
 * put the source collection in read-only mode. This entails:
 ** blocking new updates,
 ** issuing a hard commit
 ** closing the IndexWriter to make sure there aren't any ongoing background 
merges.
 * copy all documents from source to the new collection
 * create an alias pointing from the source name to the new collection. The new 
collection is already in read-write mode by default, and this operation is 
atomic.
 * optionally delete the original source
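
The steps above can be sketched with a toy in-memory model. This is not Solr code: the {{Cluster}} class, its methods and the collection names are all hypothetical, and the hard commit and IndexWriter shutdown have no counterpart here. It only shows the ordering of the steps and why updates sent to the old name land in the new collection once the alias exists.

```python
class Cluster:
    """Toy stand-in for a cluster: collections, aliases and read-only flags."""

    def __init__(self):
        self.collections = {}    # collection name -> {doc id -> doc}
        self.aliases = {}        # alias name -> real collection name
        self.read_only = set()   # names of read-only collections

    def resolve(self, name):
        """Follow an alias if one exists, otherwise use the name as-is."""
        return self.aliases.get(name, name)

    def add(self, name, doc):
        real = self.resolve(name)
        if real in self.read_only:
            raise RuntimeError(f"{real} is read-only")
        self.collections[real][doc["id"]] = doc

    def search(self, name):
        """Return the sorted ids visible under `name`."""
        return sorted(self.collections[self.resolve(name)])


def reindex(cluster, source, target, delete_source=False):
    cluster.collections[target] = {}           # 1. create the new collection
    cluster.read_only.add(source)              # 2. block updates on the source
    for doc in cluster.collections[source].values():
        cluster.collections[target][doc["id"]] = dict(doc)   # 3. copy documents
    cluster.aliases[source] = target           # 4. alias source name -> target
    if delete_source:                          # 5. optionally drop the source
        del cluster.collections[source]
        cluster.read_only.discard(source)


cluster = Cluster()
cluster.collections["books"] = {}
cluster.add("books", {"id": "1"})
reindex(cluster, "books", "books_reindexed")
cluster.add("books", {"id": "2"})   # routed to the new collection via the alias
```

After the run, searching "books" sees both documents, while the original collection still holds only the first one.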

In this scenario we never lose the ability to search the source collection, at 
the cost of losing the ability to process updates during the reindexing.

BTW, this scenario is applicable to basically any collection, not just 
{{.system}}, with the usual caveats about potentially losing the data from 
document fields that can't be retrieved from the source collection.

> Add a Collections API command to migrate the .system collection schema from 
> Trie-based (pre-7.0) to Points-based (7.0+)
> ---
>
> Key: SOLR-11127
> URL: https://issues.apache.org/jira/browse/SOLR-11127
> Project: Solr
>  Issue Type: Task
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Steve Rowe
>Assignee: Andrzej Bialecki 
>Priority: Blocker
>  Labels: numeric-tries-to-points
> Fix For: 8.0
>
>
> SOLR-9 will switch the Trie fieldtypes in the .system collection's schema 
> to Points.
> Users with pre-7.0 .system collections will no longer be able to use them 
> once Trie fields have been removed (8.0).
> Solr should provide a Collections API command MIGRATESYSTEMCOLLECTION to 
> automatically convert a Trie-based .system collection to a Points-based one.






[jira] [Commented] (SOLR-11127) Add a Collections API command to migrate the .system collection schema from Trie-based (pre-7.0) to Points-based (7.0+)

2019-01-29 Thread Jan Høydahl (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-11127?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16755022#comment-16755022
 ] 

Jan Høydahl commented on SOLR-11127:


How do we handle the two time gaps when .system will return 0 hits during 
copying?

Let's say we add a config option to put {{BlobHandler}} and 
{{UpdateRequestHandler}} into R/O mode (readOnly=true), where update requests 
return HTTP 503 Service Unavailable. Then we could start by setting .system to 
R/O, safely copy back and forth, and move the alias only when the copy is 
complete. At the end we would set .system back to readOnly=false and RELOAD the 
.system collection to get back to normal operation. I don't know how much work 
that would be, but it sounds doable.
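
A minimal sketch of the proposed behaviour, using a toy handler rather than Solr's real {{UpdateRequestHandler}}. The class, its {{read_only}} attribute and the document ids are hypothetical; the only point is that updates are answered with 503 while the flag is set and succeed again once it is cleared.

```python
from http import HTTPStatus


class UpdateHandler:
    """Toy update handler with a read-only switch."""

    def __init__(self):
        self.read_only = False
        self.docs = {}

    def handle_update(self, doc):
        """Return an HTTP status code, as a web handler would."""
        if self.read_only:
            return HTTPStatus.SERVICE_UNAVAILABLE   # 503: retry later
        self.docs[doc["id"]] = doc
        return HTTPStatus.OK                        # 200


handler = UpdateHandler()
first = handler.handle_update({"id": "blob-1"})

handler.read_only = True                  # copy in progress
rejected = handler.handle_update({"id": "blob-2"})

handler.read_only = False                 # copy finished, back to normal
retried = handler.handle_update({"id": "blob-2"})
```

A 503 tells the client that the request was not applied and can be retried, which fits a temporary read-only window.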

> Add a Collections API command to migrate the .system collection schema from 
> Trie-based (pre-7.0) to Points-based (7.0+)
> ---
>
> Key: SOLR-11127
> URL: https://issues.apache.org/jira/browse/SOLR-11127
> Project: Solr
>  Issue Type: Task
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Steve Rowe
>Assignee: Andrzej Bialecki 
>Priority: Blocker
>  Labels: numeric-tries-to-points
> Fix For: 8.0
>
>
> SOLR-9 will switch the Trie fieldtypes in the .system collection's schema 
> to Points.
> Users with pre-7.0 .system collections will no longer be able to use them 
> once Trie fields have been removed (8.0).
> Solr should provide a Collections API command MIGRATESYSTEMCOLLECTION to 
> automatically convert a Trie-based .system collection to a Points-based one.






[jira] [Commented] (SOLR-11127) Add a Collections API command to migrate the .system collection schema from Trie-based (pre-7.0) to Points-based (7.0+)

2019-01-29 Thread Andrzej Bialecki (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-11127?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16755007#comment-16755007
 ] 

Andrzej Bialecki  commented on SOLR-11127:
--

My plan of attack is to implement a collection command that orchestrates the 
following steps:
 * create a temporary collection with a unique name, eg. {{tmpCollection_123}}, 
using the updated {{.system}} schema
 * define an alias that points {{.system -> tmpCollection_123}}. This should 
redirect all updates and queries to the temp collection.
 * copy the documents from {{.system}} to the temp collection, avoiding 
overwriting updated docs (incremental updates won't work during this process, 
but AFAIK no Solr component uses incremental updates when indexing to 
{{.system}})
 * delete the original {{.system}} and create it again using the updated schema.
 * remove the alias
 * copy over the documents from temporary collection to {{.system}}, again 
avoiding overwrites.
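
The "avoiding overwrites" part of the two copy steps can be sketched as follows. Dicts keyed by document id stand in for collections, and the function name and sample data are hypothetical; how the real command would detect existing documents is not specified here.

```python
def copy_without_overwrite(source, target):
    """Copy documents from source to target, keeping ids already in target.

    Both arguments map a document id to a document. Returns the number of
    documents actually copied.
    """
    copied = 0
    for doc_id, doc in source.items():
        if doc_id not in target:
            target[doc_id] = doc
            copied += 1
    return copied


old_system = {"a": {"v": 1}, "b": {"v": 1}}
temp = {"b": {"v": 2}}   # "b" was updated through the alias during the copy
copied = copy_without_overwrite(old_system, temp)
```

Only "a" is copied; the newer version of "b" written through the alias is left untouched.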

The collection command will take care of async processing, resuming the 
operation on Overseer restarts, etc.

Comments and feedback are welcome.

(Also, given that the 8.0 release is imminent, I'm not sure I can fix this in 
time.)

> Add a Collections API command to migrate the .system collection schema from 
> Trie-based (pre-7.0) to Points-based (7.0+)
> ---
>
> Key: SOLR-11127
> URL: https://issues.apache.org/jira/browse/SOLR-11127
> Project: Solr
>  Issue Type: Task
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Steve Rowe
>Assignee: Andrzej Bialecki 
>Priority: Blocker
>  Labels: numeric-tries-to-points
> Fix For: 8.0
>
>
> SOLR-9 will switch the Trie fieldtypes in the .system collection's schema 
> to Points.
> Users with pre-7.0 .system collections will no longer be able to use them 
> once Trie fields have been removed (8.0).
> Solr should provide a Collections API command MIGRATESYSTEMCOLLECTION to 
> automatically convert a Trie-based .system collection to a Points-based one.






[jira] [Commented] (SOLR-11127) Add a Collections API command to migrate the .system collection schema from Trie-based (pre-7.0) to Points-based (7.0+)

2019-01-03 Thread Jan Høydahl (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-11127?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16733325#comment-16733325
 ] 

Jan Høydahl commented on SOLR-11127:


Anyone planning to look into this for 8.0?

> Add a Collections API command to migrate the .system collection schema from 
> Trie-based (pre-7.0) to Points-based (7.0+)
> ---
>
> Key: SOLR-11127
> URL: https://issues.apache.org/jira/browse/SOLR-11127
> Project: Solr
>  Issue Type: Task
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Steve Rowe
>Priority: Blocker
>  Labels: numeric-tries-to-points
> Fix For: master (8.0)
>
>
> SOLR-9 will switch the Trie fieldtypes in the .system collection's schema 
> to Points.
> Users with pre-7.0 .system collections will no longer be able to use them 
> once Trie fields have been removed (8.0).
> Solr should provide a Collections API command MIGRATESYSTEMCOLLECTION to 
> automatically convert a Trie-based .system collection to a Points-based one.






[jira] [Commented] (SOLR-11127) Add a Collections API command to migrate the .system collection schema from Trie-based (pre-7.0) to Points-based (7.0+)

2018-06-13 Thread Jan Høydahl (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-11127?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16510799#comment-16510799
 ] 

Jan Høydahl commented on SOLR-11127:


Perhaps there should also be a check on system startup in version 7.x that 
logs an ERROR line if the .system collection has not been converted, so people 
are alerted to the need before it's too late? That could be a new Jira issue 
for 7.5.

> Add a Collections API command to migrate the .system collection schema from 
> Trie-based (pre-7.0) to Points-based (7.0+)
> ---
>
> Key: SOLR-11127
> URL: https://issues.apache.org/jira/browse/SOLR-11127
> Project: Solr
>  Issue Type: Task
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Steve Rowe
>Priority: Blocker
>  Labels: numeric-tries-to-points
> Fix For: master (8.0)
>
>
> SOLR-9 will switch the Trie fieldtypes in the .system collection's schema 
> to Points.
> Users with pre-7.0 .system collections will no longer be able to use them 
> once Trie fields have been removed (8.0).
> Solr should provide a Collections API command MIGRATESYSTEMCOLLECTION to 
> automatically convert a Trie-based .system collection to a Points-based one.


