[
https://issues.apache.org/jira/browse/SOLR-7374?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15334905#comment-15334905
]
Hrishikesh Gadre edited comment on SOLR-7374 at 6/16/16 11:12 PM:
------------------------------------------------------------------
bq. For the scope of this Jira can we just support it in ReplicationHandler as
well ?
Sure, I am working on this. It looks like we may not be able to provide
behavior identical to the core level backup/restore API.
Specifically, when the user does not specify the "location" parameter, the
existing ReplicationHandler implementation uses a directory relative to the
"data" directory, e.g.
https://github.com/apache/lucene-solr/blob/a4455a4b14f2bf947db1136f9d5fc7d0d88d32ef/solr/core/src/java/org/apache/solr/handler/ReplicationHandler.java#L419
https://github.com/apache/lucene-solr/blob/a4455a4b14f2bf947db1136f9d5fc7d0d88d32ef/solr/core/src/java/org/apache/solr/handler/SnapShooter.java#L67
While this logic is OK on a local file-system, it would not work if the user is
using a different file-system for backup/restore. For example, consider a case
where a user configures an HDFS repository without a default location (while
using the local file-system for storing index files). When only a single
repository is configured, we use it as the "default".
Now consider a case where a user invokes backup/restore without specifying the
"location" and "repository" parameters. We don't want to use the "data"
directory as the location, since it may not be valid on HDFS. So I am adding a
constraint that if the "repository" parameter is specified, then the location
must be specified either via the "location" parameter OR via a repository
configuration in solr.xml.
When the "repository" parameter is not specified, we default to
"LocalFileSystem" instead of the configured default repository in solr.xml.
This handles the use-case mentioned above. It also helps maintain backwards
compatibility with the existing API behavior.
On the other hand, the Core level BACKUP API always fetches the "default"
repository configuration from solr.xml and requires that the location be
specified either via the "location" parameter OR via a repository
configuration. I hope this small difference in API behavior is OK (since we
should aim to retire one of the APIs).
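To make the difference concrete, here is a minimal sketch of the two resolution
rules described above. It is not the actual patch: the class
BackupTargetSketch, its method names, and the map of configured repository
locations are hypothetical, and falling back to the data directory itself is a
simplification of the existing "relative to the data directory" behavior.

```java
import java.util.Map;

/** Hypothetical illustration only; these are not Solr classes or APIs. */
final class BackupTargetSketch {

    /** Repository name used when no "repository" parameter is given. */
    static final String LOCAL_FS = "LocalFileSystem";

    /** The repository and location a backup/restore request resolves to. */
    static final class Target {
        final String repository;
        final String location;

        Target(String repository, String location) {
            this.repository = repository;
            this.location = location;
        }
    }

    /**
     * ReplicationHandler rule: without "repository", use the local
     * file-system and fall back to the data directory (simplified);
     * with "repository", a location must come from the request or
     * from the repository configuration.
     */
    static Target forReplicationHandler(String repository, String location,
            Map<String, String> configuredLocations, String dataDir) {
        if (repository == null) {
            return new Target(LOCAL_FS, location != null ? location : dataDir);
        }
        return new Target(repository,
                requireLocation(repository, location, configuredLocations));
    }

    /**
     * Core level BACKUP rule: always use the default repository from the
     * configuration and require a location from the request or the
     * repository configuration.
     */
    static Target forCoreBackupApi(String defaultRepository, String location,
            Map<String, String> configuredLocations) {
        return new Target(defaultRepository,
                requireLocation(defaultRepository, location, configuredLocations));
    }

    /** Prefer the request's location, then the configured one, else fail. */
    private static String requireLocation(String repository, String location,
            Map<String, String> configuredLocations) {
        if (location != null) {
            return location;
        }
        String configured = configuredLocations.get(repository);
        if (configured == null) {
            throw new IllegalArgumentException(
                    "No location given for repository '" + repository + "'");
        }
        return configured;
    }
}
```

With an HDFS repository configured without a location, a ReplicationHandler
request that omits both parameters still resolves to the local data directory,
while a request naming that repository without a location is rejected.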
> Backup/Restore should provide a param for specifying the directory
> implementation it should use
> -----------------------------------------------------------------------------------------------
>
> Key: SOLR-7374
> URL: https://issues.apache.org/jira/browse/SOLR-7374
> Project: Solr
> Issue Type: Bug
> Reporter: Varun Thacker
> Assignee: Mark Miller
> Fix For: 5.2, 6.0
>
> Attachments: SOLR-7374.patch, SOLR-7374.patch, SOLR-7374.patch,
> SOLR-7374.patch, SOLR-7374.patch, SOLR-7374.patch
>
>
> Currently, when we create a backup, we use SimpleFSDirectory to write the
> backup indexes. Similarly, during a restore we open the index using
> FSDirectory.open.
> We should provide a param called {{directoryImpl}} or {{type}} which will be
> used to specify the Directory implementation to backup the index.
> Likewise during a restore you would need to specify the directory impl which
> was used during backup so that the index can be opened correctly.
> This param will address the problem that currently, if a user is running Solr
> on HDFS, there is no way to use the backup/restore functionality, as the
> directory is hardcoded.
> With this, one could be running Solr on a local FS but back up the index to
> HDFS, etc.