Hrishikesh Gadre created SOLR-9038:
--------------------------------------

             Summary: Ability to create/delete/list snapshots for a solr collection
                 Key: SOLR-9038
                 URL: https://issues.apache.org/jira/browse/SOLR-9038
             Project: Solr
          Issue Type: New Feature
          Components: SolrCloud
            Reporter: Hrishikesh Gadre


Work is currently under way to implement a backup/restore API for SolrCloud 
(SOLR-5750). SOLR-5750 is about providing the ability to "copy" index files and 
collection metadata to a configurable location. 

In addition, we should provide a facility to create "named" snapshots for a 
Solr collection. By "snapshot" I mean configuring the underlying Lucene 
IndexDeletionPolicy to not delete a specific commit point (e.g. using 
PersistentSnapshotDeletionPolicy). This should not be confused with 
SOLR-5340, which implements core-level "backup" functionality.
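
To make the idea concrete, here is a minimal conceptual sketch in Python of a 
deletion policy that keeps the latest commit plus any commit pinned by a named 
snapshot. It is only a toy model of the behaviour described above: the class 
SnapshotAwareDeletionPolicy and its methods are hypothetical names chosen for 
illustration and are not the Lucene or Solr API.

```python
from dataclasses import dataclass, field


@dataclass
class SnapshotAwareDeletionPolicy:
    """Toy model (hypothetical, not Lucene's API): keep the newest commit
    plus every commit that a named snapshot refers to."""

    commits: list[int] = field(default_factory=list)         # generations still on disk
    snapshots: dict[str, int] = field(default_factory=dict)  # snapshot name -> generation

    def on_commit(self, generation: int) -> None:
        # Record the new commit, then drop commits nobody needs any more.
        self.commits.append(generation)
        self._prune()

    def create_snapshot(self, name: str) -> int:
        # Pin the most recent commit under the given name.
        if name in self.snapshots:
            raise ValueError(f"snapshot {name!r} already exists")
        if not self.commits:
            raise ValueError("no commit to snapshot")
        self.snapshots[name] = self.commits[-1]
        return self.snapshots[name]

    def delete_snapshot(self, name: str) -> None:
        # Unpin the commit; it becomes eligible for deletion.
        del self.snapshots[name]
        self._prune()

    def list_snapshots(self) -> dict[str, int]:
        return dict(self.snapshots)

    def _prune(self) -> None:
        # Keep only the latest commit and the pinned ones.
        pinned = set(self.snapshots.values())
        latest = self.commits[-1]
        self.commits = [g for g in self.commits if g == latest or g in pinned]


policy = SnapshotAwareDeletionPolicy()
policy.on_commit(1)
policy.create_snapshot("before-experiment")  # pins generation 1
policy.on_commit(2)
policy.on_commit(3)
print(policy.commits)  # generation 1 survives because it is pinned
policy.delete_snapshot("before-experiment")
print(policy.commits)  # only the latest commit remains
```

Note that no files are copied at any point in this sketch. Creating or 
deleting a snapshot only changes which commit points are protected from 
deletion, which is the decoupling this issue proposes.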

The primary motivation for this feature is to decouple recording/preserving a 
known consistent state of a collection from actually "copying" the relevant 
files to a physically separate location. This decoupling has a number of 
advantages:
- We can use specialized data-copying tools for transferring Solr index files. 
For example, in a Hadoop environment the 
[distcp|https://hadoop.apache.org/docs/r1.2.1/distcp2.html] tool is typically 
used to copy files from one location to another. This tool provides various 
options to configure the degree of parallelism and bandwidth usage, as well as 
integration with different types and versions of file systems (e.g. AWS S3, 
Azure Blob store etc.).
- This separation of concerns would also help Solr focus on its key 
functionality (i.e. querying and indexing) while delegating the copy operation 
to tools built for that purpose.
- Users can decide if and when to copy the data files, independently of 
creating a snapshot. For example, a user may want to create a snapshot of the 
data before making an experimental change (e.g. updating/deleting docs, a 
schema change etc.). If the experiment is successful, the user can delete the 
snapshot (without having to copy the files). If the experiment fails, the user 
can copy the files associated with the snapshot and restore from it.

Note that the Apache Blur project also provides a similar feature: 
[BLUR-132|https://issues.apache.org/jira/browse/BLUR-132].



