[ https://issues.apache.org/jira/browse/SOLR-9038?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15262249#comment-15262249 ]
David Smiley commented on SOLR-9038:
------------------------------------

Committing with metadata: one thing that occurred to me is that currently a commit is short-circuited if there is no new data. But a commit with metadata needs to be persisted (unless the metadata is identical).

> Ability to create/delete/list snapshots for a solr collection
> -------------------------------------------------------------
>
>                 Key: SOLR-9038
>                 URL: https://issues.apache.org/jira/browse/SOLR-9038
>             Project: Solr
>          Issue Type: New Feature
>          Components: SolrCloud
>            Reporter: Hrishikesh Gadre
>
> Currently work is underway to implement a backup/restore API for SolrCloud
> (SOLR-5750). SOLR-5750 is about providing the ability to "copy" index files
> and collection metadata to a configurable location.
> In addition to this, we should also provide a facility to create "named"
> snapshots for a Solr collection. Here by "snapshot" I mean configuring the
> underlying Lucene IndexDeletionPolicy to not delete a specific commit point
> (e.g. using PersistentSnapshotIndexDeletionPolicy). This should not be
> confused with SOLR-5340, which implements core-level "backup" functionality.
> The primary motivation of this feature is to decouple recording/preserving a
> known consistent state of a collection from actually "copying" the relevant
> files to a physically separate location. This decoupling has a number of
> advantages:
> - We can use specialized data-copying tools for transferring Solr index
> files, e.g. in a Hadoop environment, the
> [distcp|https://hadoop.apache.org/docs/r1.2.1/distcp2.html] tool is typically
> used to copy files from one location to another. This tool provides various
> options to configure the degree of parallelism and bandwidth usage, as well as
> integration with different types and versions of file systems (e.g. AWS S3,
> Azure Blob store etc.)
> - This separation of concerns would also help Solr focus on its key
> functionality (i.e. querying and indexing) while delegating the copy
> operation to the tools built for that purpose.
> - Users can decide if/when to copy the data files, as opposed to creating a
> snapshot, e.g. a user may want to create a snapshot of a collection before
> making an experimental change (e.g. updating/deleting docs, schema change
> etc.). If the experiment is successful, they can delete the snapshot (without
> having to copy the files). If the experiment fails, they can copy the
> files associated with the snapshot and restore.
> Note that the Apache Blur project also provides a similar feature:
> [BLUR-132|https://issues.apache.org/jira/browse/BLUR-132]
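
To make the commit-with-metadata point in the comment above concrete, here is a minimal sketch of the decision rule as I read it. This is a toy model, not Solr or Lucene code: the class `CommitDecision`, the method `shouldPersist`, and its parameters are hypothetical names, and treating an empty metadata map as "no metadata supplied" is my assumption.

```java
import java.util.Map;

// Toy model only (hypothetical names, not Solr/Lucene API):
// decides whether a commit request should be persisted or short-circuited.
final class CommitDecision {
    private CommitDecision() {}

    /**
     * @param hasPendingChanges  true if docs were added/updated/deleted since the last commit
     * @param lastMetadata       metadata stored with the most recent commit
     * @param requestedMetadata  metadata supplied with this commit request
     *                           (assumption: an empty map means "none supplied")
     * @return true if a new commit point should be written
     */
    static boolean shouldPersist(boolean hasPendingChanges,
                                 Map<String, String> lastMetadata,
                                 Map<String, String> requestedMetadata) {
        if (hasPendingChanges) {
            return true; // new data always requires a real commit
        }
        if (requestedMetadata.isEmpty()) {
            return false; // no new data, no metadata: existing short-circuit
        }
        // No new data, but metadata was supplied: persist unless it is identical.
        return !requestedMetadata.equals(lastMetadata);
    }

    public static void main(String[] args) {
        Map<String, String> last = Map.of("snapshot", "before-experiment");

        // No new data, no metadata: skipped.
        System.out.println(shouldPersist(false, last, Map.of()));                                  // false
        // No new data, identical metadata: skipped.
        System.out.println(shouldPersist(false, last, Map.of("snapshot", "before-experiment")));   // false
        // No new data, different metadata: must be persisted.
        System.out.println(shouldPersist(false, last, Map.of("snapshot", "after-experiment")));    // true
    }
}
```

The only change relative to a plain "skip when nothing changed" check is the last branch, which is what would let a named snapshot be recorded on an otherwise idle index.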