Re: [sheepdog] [PATCH 1/2] dog: add a new option for reducing identical snapshots
At Thu, 12 Feb 2015 15:49:00 +0800, Liu Yuan wrote: On Thu, Feb 12, 2015 at 04:40:56PM +0900, Hitoshi Mitake wrote: At Thu, 12 Feb 2015 15:31:15 +0800, Liu Yuan wrote: On Thu, Feb 12, 2015 at 03:59:51PM +0900, Hitoshi Mitake wrote: At Thu, 12 Feb 2015 14:38:37 +0800, Liu Yuan wrote: On Mon, Feb 09, 2015 at 05:25:48PM +0900, Hitoshi Mitake wrote: Current dog vdi snapshot command creates a new snapshot unconditionally, even if a working VDI doesn't have its own objects. In such a case, the created snapshot is redundant because same VDI is already existing. What kind of use case will create two identical snapshots? This logic is simple and code is clean, but I doubt if there is real users of this option. Generally speaking, taking snapshot periodically is an ordinal usecase of enterprise SAN. Of course sheepdog can support this use case. In a case of sheepdog, making cron job (e.g. daily) which invokes dog vdi snapshot simply enables it. But if a VDI doesn't have COWed objects, the snapshot will be redundant. So I want to add this option. Okay, your patch makes sense for periodic snapshot. But if dog have found identical snapshots, it won't create a new one and return success to the caller. I assume the caller is some middleware, if there is no new vdi returned, will this cause trouble for it? This means it will need to call 'vdi list' to check if new vdi created or not? So I'm adding this feature with the new option. Existing semantics isn't affected. And if checking process (has_own_objects()) faces error, it is reported correctly to middleware. I'm not agasint this patch, but I have some questions. For identical snapshots, the overhead is just an inode object created, no? Looks to me the overhead is quite small and no need a special option to remove it. Taking snapshots of thousands of VDIs will consume thousands of VID, and create thousands * replication factor of inodes. I'm not sure the consumption of VID will become serious problem, but inodes will make replication time longer (e.g. 16:4 ec requires 20 inodes). Yes, this is the point. Make sense to me. I have some comments to the code in my last email. Could you submit a V2? BTW, it would be great if you can include above rationale into the commit log. OK, I'll send v2 later. Thanks, Hitoshi Thanks Yuan -- sheepdog mailing list sheepdog@lists.wpkg.org https://lists.wpkg.org/mailman/listinfo/sheepdog -- sheepdog mailing list sheepdog@lists.wpkg.org https://lists.wpkg.org/mailman/listinfo/sheepdog
Re: [sheepdog] [PATCH 1/2] dog: add a new option for reducing identical snapshots
On Thu, Feb 12, 2015 at 04:40:56PM +0900, Hitoshi Mitake wrote: At Thu, 12 Feb 2015 15:31:15 +0800, Liu Yuan wrote: On Thu, Feb 12, 2015 at 03:59:51PM +0900, Hitoshi Mitake wrote: At Thu, 12 Feb 2015 14:38:37 +0800, Liu Yuan wrote: On Mon, Feb 09, 2015 at 05:25:48PM +0900, Hitoshi Mitake wrote: Current dog vdi snapshot command creates a new snapshot unconditionally, even if a working VDI doesn't have its own objects. In such a case, the created snapshot is redundant because same VDI is already existing. What kind of use case will create two identical snapshots? This logic is simple and code is clean, but I doubt if there is real users of this option. Generally speaking, taking snapshot periodically is an ordinal usecase of enterprise SAN. Of course sheepdog can support this use case. In a case of sheepdog, making cron job (e.g. daily) which invokes dog vdi snapshot simply enables it. But if a VDI doesn't have COWed objects, the snapshot will be redundant. So I want to add this option. Okay, your patch makes sense for periodic snapshot. But if dog have found identical snapshots, it won't create a new one and return success to the caller. I assume the caller is some middleware, if there is no new vdi returned, will this cause trouble for it? This means it will need to call 'vdi list' to check if new vdi created or not? So I'm adding this feature with the new option. Existing semantics isn't affected. And if checking process (has_own_objects()) faces error, it is reported correctly to middleware. I'm not agasint this patch, but I have some questions. For identical snapshots, the overhead is just an inode object created, no? Looks to me the overhead is quite small and no need a special option to remove it. Taking snapshots of thousands of VDIs will consume thousands of VID, and create thousands * replication factor of inodes. I'm not sure the consumption of VID will become serious problem, but inodes will make replication time longer (e.g. 16:4 ec requires 20 inodes). Yes, this is the point. Make sense to me. I have some comments to the code in my last email. Could you submit a V2? BTW, it would be great if you can include above rationale into the commit log. Thanks Yuan -- sheepdog mailing list sheepdog@lists.wpkg.org https://lists.wpkg.org/mailman/listinfo/sheepdog
Re: [sheepdog] [PATCH 1/2] dog: add a new option for reducing identical snapshots
On Mon, Feb 09, 2015 at 05:25:48PM +0900, Hitoshi Mitake wrote: Current dog vdi snapshot command creates a new snapshot unconditionally, even if a working VDI doesn't have its own objects. In such a case, the created snapshot is redundant because same VDI is already existing. What kind of use case will create two identical snapshots? This logic is simple and code is clean, but I doubt if there is real users of this option. This patch adds a new option -R to the dog command for reducing the identical snapshots. Signed-off-by: Hitoshi Mitake mitake.hito...@lab.ntt.co.jp --- dog/vdi.c | 48 +++- 1 file changed, 47 insertions(+), 1 deletion(-) diff --git a/dog/vdi.c b/dog/vdi.c index 8e612af..ee465c2 100644 --- a/dog/vdi.c +++ b/dog/vdi.c @@ -40,6 +40,8 @@ static struct sd_option vdi_options[] = { neither comparing nor repairing}, {'z', block_size_shift, true, specify the bit shift num for data object size}, + {'R', reduce-identical-snapshots, false, do not create snapshot if + working VDI doesn't have its own objects}, { 0, NULL, false, NULL }, }; @@ -61,6 +63,7 @@ static struct vdi_cmd_data { uint64_t oid; bool no_share; bool exist; + bool reduce_identical_snapshots; } vdi_cmd_data = { ~0, }; struct get_vdi_info { @@ -605,6 +608,31 @@ fail: return NULL; } +static bool has_own_objects(uint32_t vid, int *ret) Traditionally, we'll have functions return SD_RES_xxx because in this way we could propragate the ret to upper callers. So it is better to have has_own_objects return SD_RES_xxx for consistency. Thanks, Yuan -- sheepdog mailing list sheepdog@lists.wpkg.org https://lists.wpkg.org/mailman/listinfo/sheepdog
Re: [sheepdog] [PATCH 1/2] dog: add a new option for reducing identical snapshots
At Thu, 12 Feb 2015 15:31:15 +0800, Liu Yuan wrote: On Thu, Feb 12, 2015 at 03:59:51PM +0900, Hitoshi Mitake wrote: At Thu, 12 Feb 2015 14:38:37 +0800, Liu Yuan wrote: On Mon, Feb 09, 2015 at 05:25:48PM +0900, Hitoshi Mitake wrote: Current dog vdi snapshot command creates a new snapshot unconditionally, even if a working VDI doesn't have its own objects. In such a case, the created snapshot is redundant because same VDI is already existing. What kind of use case will create two identical snapshots? This logic is simple and code is clean, but I doubt if there is real users of this option. Generally speaking, taking snapshot periodically is an ordinal usecase of enterprise SAN. Of course sheepdog can support this use case. In a case of sheepdog, making cron job (e.g. daily) which invokes dog vdi snapshot simply enables it. But if a VDI doesn't have COWed objects, the snapshot will be redundant. So I want to add this option. Okay, your patch makes sense for periodic snapshot. But if dog have found identical snapshots, it won't create a new one and return success to the caller. I assume the caller is some middleware, if there is no new vdi returned, will this cause trouble for it? This means it will need to call 'vdi list' to check if new vdi created or not? So I'm adding this feature with the new option. Existing semantics isn't affected. And if checking process (has_own_objects()) faces error, it is reported correctly to middleware. I'm not agasint this patch, but I have some questions. For identical snapshots, the overhead is just an inode object created, no? Looks to me the overhead is quite small and no need a special option to remove it. Taking snapshots of thousands of VDIs will consume thousands of VID, and create thousands * replication factor of inodes. I'm not sure the consumption of VID will become serious problem, but inodes will make replication time longer (e.g. 16:4 ec requires 20 inodes). Thanks, Hitoshi Thanks Yuan -- sheepdog mailing list sheepdog@lists.wpkg.org https://lists.wpkg.org/mailman/listinfo/sheepdog -- sheepdog mailing list sheepdog@lists.wpkg.org https://lists.wpkg.org/mailman/listinfo/sheepdog
Re: [sheepdog] [PATCH 1/2] dog: add a new option for reducing identical snapshots
At Thu, 12 Feb 2015 14:38:37 +0800, Liu Yuan wrote: On Mon, Feb 09, 2015 at 05:25:48PM +0900, Hitoshi Mitake wrote: Current dog vdi snapshot command creates a new snapshot unconditionally, even if a working VDI doesn't have its own objects. In such a case, the created snapshot is redundant because same VDI is already existing. What kind of use case will create two identical snapshots? This logic is simple and code is clean, but I doubt if there is real users of this option. Generally speaking, taking snapshot periodically is an ordinal usecase of enterprise SAN. Of course sheepdog can support this use case. In a case of sheepdog, making cron job (e.g. daily) which invokes dog vdi snapshot simply enables it. But if a VDI doesn't have COWed objects, the snapshot will be redundant. So I want to add this option. Of course, vdi list will provide information about the COWed object (used field). But vdi list is heavy operation in a cluster which has many VDIs because it issues bunch of read requests for inode headers. So avoiding vdi listing as much as possible is fine. Thanks, Hitoshi This patch adds a new option -R to the dog command for reducing the identical snapshots. Signed-off-by: Hitoshi Mitake mitake.hito...@lab.ntt.co.jp --- dog/vdi.c | 48 +++- 1 file changed, 47 insertions(+), 1 deletion(-) diff --git a/dog/vdi.c b/dog/vdi.c index 8e612af..ee465c2 100644 --- a/dog/vdi.c +++ b/dog/vdi.c @@ -40,6 +40,8 @@ static struct sd_option vdi_options[] = { neither comparing nor repairing}, {'z', block_size_shift, true, specify the bit shift num for data object size}, + {'R', reduce-identical-snapshots, false, do not create snapshot if +working VDI doesn't have its own objects}, { 0, NULL, false, NULL }, }; @@ -61,6 +63,7 @@ static struct vdi_cmd_data { uint64_t oid; bool no_share; bool exist; + bool reduce_identical_snapshots; } vdi_cmd_data = { ~0, }; struct get_vdi_info { @@ -605,6 +608,31 @@ fail: return NULL; } +static bool has_own_objects(uint32_t vid, int *ret) Traditionally, we'll have functions return SD_RES_xxx because in this way we could propragate the ret to upper callers. So it is better to have has_own_objects return SD_RES_xxx for consistency. Thanks, Yuan -- sheepdog mailing list sheepdog@lists.wpkg.org https://lists.wpkg.org/mailman/listinfo/sheepdog -- sheepdog mailing list sheepdog@lists.wpkg.org https://lists.wpkg.org/mailman/listinfo/sheepdog
Re: [sheepdog] [PATCH 1/2] dog: add a new option for reducing identical snapshots
On Thu, Feb 12, 2015 at 03:59:51PM +0900, Hitoshi Mitake wrote: At Thu, 12 Feb 2015 14:38:37 +0800, Liu Yuan wrote: On Mon, Feb 09, 2015 at 05:25:48PM +0900, Hitoshi Mitake wrote: Current dog vdi snapshot command creates a new snapshot unconditionally, even if a working VDI doesn't have its own objects. In such a case, the created snapshot is redundant because same VDI is already existing. What kind of use case will create two identical snapshots? This logic is simple and code is clean, but I doubt if there is real users of this option. Generally speaking, taking snapshot periodically is an ordinal usecase of enterprise SAN. Of course sheepdog can support this use case. In a case of sheepdog, making cron job (e.g. daily) which invokes dog vdi snapshot simply enables it. But if a VDI doesn't have COWed objects, the snapshot will be redundant. So I want to add this option. Okay, your patch makes sense for periodic snapshot. But if dog have found identical snapshots, it won't create a new one and return success to the caller. I assume the caller is some middleware, if there is no new vdi returned, will this cause trouble for it? This means it will need to call 'vdi list' to check if new vdi created or not? I'm not agasint this patch, but I have some questions. For identical snapshots, the overhead is just an inode object created, no? Looks to me the overhead is quite small and no need a special option to remove it. Thanks Yuan -- sheepdog mailing list sheepdog@lists.wpkg.org https://lists.wpkg.org/mailman/listinfo/sheepdog
[sheepdog] [PATCH 1/2] dog: add a new option for reducing identical snapshots
Current dog vdi snapshot command creates a new snapshot unconditionally, even if a working VDI doesn't have its own objects. In such a case, the created snapshot is redundant because same VDI is already existing. This patch adds a new option -R to the dog command for reducing the identical snapshots. Signed-off-by: Hitoshi Mitake mitake.hito...@lab.ntt.co.jp --- dog/vdi.c | 48 +++- 1 file changed, 47 insertions(+), 1 deletion(-) diff --git a/dog/vdi.c b/dog/vdi.c index 8e612af..ee465c2 100644 --- a/dog/vdi.c +++ b/dog/vdi.c @@ -40,6 +40,8 @@ static struct sd_option vdi_options[] = { neither comparing nor repairing}, {'z', block_size_shift, true, specify the bit shift num for data object size}, + {'R', reduce-identical-snapshots, false, do not create snapshot if +working VDI doesn't have its own objects}, { 0, NULL, false, NULL }, }; @@ -61,6 +63,7 @@ static struct vdi_cmd_data { uint64_t oid; bool no_share; bool exist; + bool reduce_identical_snapshots; } vdi_cmd_data = { ~0, }; struct get_vdi_info { @@ -605,6 +608,31 @@ fail: return NULL; } +static bool has_own_objects(uint32_t vid, int *ret) +{ + struct sd_inode *inode; + bool result = true; + + inode = xzalloc(sizeof(*inode)); + + *ret = dog_read_object(vid_to_vdi_oid(vid), inode, + sizeof(*inode), 0, true); + if (*ret != SD_RES_SUCCESS) + goto out; + + for (int i = 0; i SD_INODE_DATA_INDEX; i++) { + if (inode-data_vdi_id[i] inode-data_vdi_id[i] == vid) + /* VDI has its own object */ + goto out; + } + + result = false; + +out: + free(inode); + return result; +} + static int vdi_snapshot(int argc, char **argv) { const char *vdiname = argv[optind++]; @@ -703,6 +731,22 @@ static int vdi_snapshot(int argc, char **argv) nr_issued_prevent_inode_update++; } + if (vdi_cmd_data.reduce_identical_snapshots) { + bool result = has_own_objects(vid, ret); + + if (ret != SD_RES_SUCCESS) + goto out; + + if (!result) { + if (verbose) + sd_info(VDI %s doesn't have its own objects, + skipping creation of snapshot, + vdiname); + + goto out; + } + } + ret = dog_write_object(vid_to_vdi_oid(vid), 0, vdi_cmd_data.snapshot_tag, SD_MAX_VDI_TAG_LEN, @@ -3057,7 +3101,7 @@ static struct subcommand vdi_cmd[] = { {create, vdiname size, PycaphrvzT, create an image, NULL, CMD_NEED_NODELIST|CMD_NEED_ARG, vdi_create, vdi_options}, - {snapshot, vdiname, saphrvT, create a snapshot, + {snapshot, vdiname, saphrvTR, create a snapshot, NULL, CMD_NEED_ARG, vdi_snapshot, vdi_options}, {clone, src vdi dst vdi, sPnaphrvT, clone an image, @@ -3222,6 +3266,8 @@ static int vdi_parser(int ch, const char *opt) } vdi_cmd_data.block_size_shift = block_size_shift; break; + case 'R': + vdi_cmd_data.reduce_identical_snapshots = true; } return 0; -- 1.9.1 -- sheepdog mailing list sheepdog@lists.wpkg.org https://lists.wpkg.org/mailman/listinfo/sheepdog