[ https://issues.apache.org/jira/browse/HBASE-15469?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15199615#comment-15199615 ]
Matteo Bertozzi commented on HBASE-15469:
-----------------------------------------

What happens if you try to clone or restore? Is the table created with all the
families, with those not in the snapshot left empty?

> Take snapshot by family
> -----------------------
>
>                 Key: HBASE-15469
>                 URL: https://issues.apache.org/jira/browse/HBASE-15469
>             Project: HBase
>          Issue Type: Improvement
>          Components: snapshots
>    Affects Versions: 2.0.0
>            Reporter: Jianwei Cui
>         Attachments: HBASE-15469-v1.patch
>
>
> In our production environment, there are some 'wide' tables in the offline
> cluster. A 'wide' table has many families, and different applications access
> different families of the table through MapReduce. When an application starts
> to provide online service, we need to copy the families it needs from the
> offline cluster to the online cluster. For future writes, inter-cluster
> replication supports setting families per table, so we can use it to copy
> future edits for the needed families. For existing data, we can take a
> snapshot of the table on the offline cluster, then use {{ExportSnapshot}} to
> copy the snapshot to the online cluster and clone it. However, we can only
> take a snapshot of the whole table, in which many families are not needed by
> the application; this leads to unnecessary data copying. I think it would be
> useful to support taking a snapshot by family, so that we copy only the
> needed data.
> Possible solution to support this:
> 1. Add a family names field to the protobuf definition of
> {{SnapshotDescription}}.
> 2. Allow setting families when taking a snapshot in the hbase shell, such as:
> {code}
> snapshot 'tableName', 'snapshotName', 'FamilyA', 'FamilyB', {SKIP_FLUSH => true}
> {code}
> 3. Add family names to {{SnapshotDescription}} on the client side.
> 4. Read family names from {{SnapshotDescription}} in the Master/RegionServer,
> and keep only the requested families when taking the snapshot of a region.
> Discussions and suggestions are welcome.
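
To make steps 2 and 4 of the quoted proposal concrete, here is a rough Ruby
sketch of how the trailing arguments of the proposed shell command might be
split into family names and options, and how the requested families might be
checked against a table's families. This is not the code from
HBASE-15469-v1.patch. The helper names parse_snapshot_args and
families_to_snapshot are hypothetical, the SKIP_FLUSH constant is defined
locally as a stand-in for the shell's own, and treating an empty family list
as "snapshot every family" is an assumption. It does not address what clone
or restore should do with the families that were left out.

```ruby
# Local stand-in for the shell's SKIP_FLUSH option key (assumption).
SKIP_FLUSH = 'SKIP_FLUSH'

# Hypothetical helper: split the arguments that follow the table and snapshot
# names into a list of family names and an optional trailing options hash.
def parse_snapshot_args(*args)
  options = args.last.is_a?(Hash) ? args.pop : {}
  families = args.map(&:to_s)
  [families, options]
end

# Hypothetical helper: decide which families a region snapshot would keep.
# An empty request is assumed to mean "all families", matching today's
# whole-table behaviour. Unknown family names are rejected early.
def families_to_snapshot(table_families, requested)
  return table_families.dup if requested.empty?

  unknown = requested - table_families
  unless unknown.empty?
    raise ArgumentError, "unknown families: #{unknown.join(', ')}"
  end

  # Array intersection keeps the table's own ordering and drops duplicates.
  table_families & requested
end

families, options = parse_snapshot_args('FamilyA', 'FamilyB', { SKIP_FLUSH => true })
kept = families_to_snapshot(%w[FamilyA FamilyB FamilyC], families)
puts "families kept: #{kept.inspect}, options: #{options.inspect}"
```

Rejecting unknown family names on the client side is only one possible
choice; the check could equally be done in the Master when it reads the
family names from {{SnapshotDescription}}.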