[
https://issues.apache.org/jira/browse/SOLR-9055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15269135#comment-15269135
]
David Smiley commented on SOLR-9055:
------------------------------------
(p.s. use {{bq.}} to quote)
bq. (me) I have a general question about HDFS; I have no real experience with
it: I wonder if Java's NIO file abstractions could be used so we don't have to
maintain separate code? If so, it would be wonderful: simpler and less code to
maintain. See https://github.com/damiencarol/jsr203-hadoop. What do you think?
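The appeal of the NIO route is that `java.nio.file` dispatches on the URI's FileSystemProvider, so the same copy code would serve both local and HDFS paths once a provider like jsr203-hadoop is on the classpath. A minimal sketch of that idea (class and method names here are illustrative, not Solr code; it is exercised against the default `file://` provider only):

```java
import java.nio.file.*;

public class NioBackupSketch {
    // Copy every entry under src to dst. Nothing here is specific to the
    // local file-system: with jsr203-hadoop installed, src or dst could be
    // a Path obtained from an hdfs:// URI and the code would be unchanged.
    static void copyTree(Path src, Path dst) throws Exception {
        Files.walk(src).forEach(p -> {
            try {
                Path target = dst.resolve(src.relativize(p).toString());
                if (Files.isDirectory(p)) {
                    Files.createDirectories(target);
                } else {
                    Files.copy(p, target, StandardCopyOption.REPLACE_EXISTING);
                }
            } catch (Exception e) {
                throw new RuntimeException(e);
            }
        });
    }

    public static void main(String[] args) throws Exception {
        // Demo against the default provider: fake one index file, back it up.
        Path src = Files.createTempDirectory("index");
        Files.writeString(src.resolve("segments_1"), "dummy segment data");
        Path dst = Files.createTempDirectory("backup");
        copyTree(src, dst);
        System.out.println(Files.readString(dst.resolve("segments_1")));
    }
}
```

The point is that one implementation could cover every backing store that ships a FileSystemProvider, which is what "simpler and less code to maintain" buys.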
bq. (Gadre) Although integrating HDFS and the Java NIO API sounds interesting, I
would prefer it to be provided directly by the HDFS client library, as opposed
to a third-party library which may or may not be supported in the future. Also,
since Solr provides an HDFS-backed Directory implementation, it probably makes
sense to reuse it.
Any thoughts on this one [[email protected]] or [~gchanan] perhaps?
bq. However, if we want to keep things simple, we can choose not to provide
separate APIs to configure "repositories". Instead we can just pick the same
file-system used to store the indexed data. That means that for the local
file-system the backup will be stored on a shared file-system using the
SimpleFSDirectory implementation, AND for HDFS we will use the HdfsDirectory
impl. Make sense?
I understand what you mean, but it seems a shame, and it loses the extensibility
we want. I think what this comes down to is: should we re-use the Lucene
Directory API for moving data in/out of the backup location, or should we use
something else?
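To make the "something else" concrete, here is a hedged sketch of what a small repository abstraction decoupled from Lucene's Directory might look like. All names (`BackupRepository`, `LocalFsRepository`) are hypothetical, not Solr's actual API; the idea is that the location's URI scheme would select the implementation:

```java
import java.io.*;
import java.net.URI;
import java.nio.file.*;

// Hypothetical repository contract: just the stream-level operations a
// backup needs, with no dependency on Lucene's Directory API.
interface BackupRepository {
    OutputStream createOutput(URI path) throws IOException;
    InputStream openInput(URI path) throws IOException;
    boolean exists(URI path);
}

// Local file-system implementation, i.e. the file:// default. An HDFS or
// S3 variant would implement the same three methods.
class LocalFsRepository implements BackupRepository {
    public OutputStream createOutput(URI path) throws IOException {
        return Files.newOutputStream(Paths.get(path));
    }
    public InputStream openInput(URI path) throws IOException {
        return Files.newInputStream(Paths.get(path));
    }
    public boolean exists(URI path) {
        return Files.exists(Paths.get(path));
    }
}

public class RepositoryDemo {
    public static void main(String[] args) throws Exception {
        BackupRepository repo = new LocalFsRepository();
        Path tmp = Files.createTempFile("backup", ".bin");
        try (OutputStream out = repo.createOutput(tmp.toUri())) {
            out.write("segment bytes".getBytes());
        }
        try (InputStream in = repo.openInput(tmp.toUri())) {
            System.out.println(new String(in.readAllBytes()));
        }
    }
}
```

Against this, re-using Directory means less new surface area but couples the backup path to Lucene's I/O model; a dedicated interface keeps the extension point explicit.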
bq. I think the main problem here is identifying the type of file-system used
for a given collection at the Overseer. (The Solr core, on the other hand,
already has a Directory factory reference, so we can instantiate the
appropriate directory in the snapshooter.)
It was suggested early in SOLR-5750 that the location param should have a
protocol/impl scheme URL prefix (assume {{file://}} if not specified). That
may help the Overseer? Or, if you mean it needs to know the directory impl of
the live indexes, I imagine it could look that up the same way Solr's admin
screen does (which shows the impl factory).
I doubt I'll have time to help much more here... I'm a bit behind on my
workload.
> Make collection backup/restore extensible
> -----------------------------------------
>
> Key: SOLR-9055
> URL: https://issues.apache.org/jira/browse/SOLR-9055
> Project: Solr
> Issue Type: Task
> Reporter: Hrishikesh Gadre
> Assignee: Mark Miller
> Attachments: SOLR-9055.patch
>
>
> SOLR-5750 implemented a backup/restore API for Solr. This JIRA is to track the
> code cleanup/refactoring. Specifically, the following improvements should be made:
> - Add the Solr/Lucene version to check the compatibility between the backup
> version and the version of Solr on which it is being restored.
> - Add a backup implementation version to check the compatibility between the
> "restore" implementation and the backup format.
> - Introduce a Strategy interface to define how the Solr index data is backed
> up (e.g. using a file-copy approach).
> - Introduce a Repository interface to define the file-system used to store
> the backup data. (It currently works only with the local file system but can
> be extended.) This should be enhanced to introduce support for "registering"
> repositories (e.g. HDFS, S3, etc.)
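The version-compatibility checks listed above could be as simple as recording the Solr version in the backup metadata and comparing it at restore time. A hedged sketch, assuming a plain major.minor comparison (the format and policy here are illustrative, not what the patch specifies):

```java
public class VersionCheck {
    // Parse "major.minor" into two ints; format is an assumption.
    static int[] parse(String v) {
        String[] parts = v.split("\\.");
        return new int[]{Integer.parseInt(parts[0]), Integer.parseInt(parts[1])};
    }

    // A backup taken on backupVersion is restorable only onto the same
    // or a newer Solr (illustrative policy).
    static boolean restorable(String backupVersion, String runtimeVersion) {
        int[] b = parse(backupVersion), r = parse(runtimeVersion);
        return r[0] > b[0] || (r[0] == b[0] && r[1] >= b[1]);
    }

    public static void main(String[] args) {
        System.out.println(restorable("6.0", "6.1")); // restore onto newer Solr
        System.out.println(restorable("6.1", "5.5")); // restore onto older Solr
    }
}
```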
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)