Re: [sheepdog] [PATCH 1/2] dog: add a new option for reducing identical snapshots

2015-02-12 Thread Hitoshi Mitake
At Thu, 12 Feb 2015 15:49:00 +0800,
Liu Yuan wrote:
 
 On Thu, Feb 12, 2015 at 04:40:56PM +0900, Hitoshi Mitake wrote:
  At Thu, 12 Feb 2015 15:31:15 +0800,
  Liu Yuan wrote:
   
   On Thu, Feb 12, 2015 at 03:59:51PM +0900, Hitoshi Mitake wrote:
At Thu, 12 Feb 2015 14:38:37 +0800,
Liu Yuan wrote:
 
 On Mon, Feb 09, 2015 at 05:25:48PM +0900, Hitoshi Mitake wrote:
  Current dog vdi snapshot command creates a new snapshot
  unconditionally, even if a working VDI doesn't have its own
  objects. In such a case, the created snapshot is redundant because
  same VDI is already existing.
 
 What kind of use case will create two identical snapshots? This logic 
 is simple
 and code is clean, but I doubt if there is real users of this option.

Generally speaking, taking snapshot periodically is an ordinal usecase
of enterprise SAN. Of course sheepdog can support this use case. In a
case of sheepdog, making cron job (e.g. daily) which invokes dog vdi
snapshot simply enables it.

But if a VDI doesn't have COWed objects, the snapshot will be
redundant. So I want to add this option.
   
   Okay, your patch makes sense for periodic snapshot. But if dog have found
   identical snapshots, it won't create a new one and return success to the 
   caller.
   
   I assume the caller is some middleware, if there is no new vdi returned, 
   will
   this cause trouble for it? This means it will need to call 'vdi list' to 
   check
   if new vdi created or not? 
  
  So I'm adding this feature with the new option. Existing semantics
  isn't affected. And if checking process (has_own_objects()) faces
  error, it is reported correctly to middleware.
  
   
   I'm not agasint this patch, but I have some questions. For identical 
   snapshots,
   the overhead is just an inode object created, no? Looks to me the 
   overhead is
   quite small and no need a special option to remove it.
  
  Taking snapshots of thousands of VDIs will consume thousands of VID,
  and create thousands * replication factor of inodes. I'm not sure the
  consumption of VID will become serious problem, but inodes will make
  replication time longer (e.g. 16:4 ec requires 20 inodes).
 
 Yes, this is the point. Make sense to me. I have some comments to the code in
 my last email. Could you submit a V2? BTW, it would be great if you can 
 include
 above rationale into the commit log.

OK, I'll send v2 later.

Thanks,
Hitoshi

 
 Thanks
 Yuan
 -- 
 sheepdog mailing list
 sheepdog@lists.wpkg.org
 https://lists.wpkg.org/mailman/listinfo/sheepdog
-- 
sheepdog mailing list
sheepdog@lists.wpkg.org
https://lists.wpkg.org/mailman/listinfo/sheepdog


Re: [sheepdog] [PATCH 1/2] dog: add a new option for reducing identical snapshots

2015-02-11 Thread Liu Yuan
On Thu, Feb 12, 2015 at 04:40:56PM +0900, Hitoshi Mitake wrote:
 At Thu, 12 Feb 2015 15:31:15 +0800,
 Liu Yuan wrote:
  
  On Thu, Feb 12, 2015 at 03:59:51PM +0900, Hitoshi Mitake wrote:
   At Thu, 12 Feb 2015 14:38:37 +0800,
   Liu Yuan wrote:

On Mon, Feb 09, 2015 at 05:25:48PM +0900, Hitoshi Mitake wrote:
 Current dog vdi snapshot command creates a new snapshot
 unconditionally, even if a working VDI doesn't have its own
 objects. In such a case, the created snapshot is redundant because
 same VDI is already existing.

What kind of use case will create two identical snapshots? This logic 
is simple
and code is clean, but I doubt if there is real users of this option.
   
   Generally speaking, taking snapshot periodically is an ordinal usecase
   of enterprise SAN. Of course sheepdog can support this use case. In a
   case of sheepdog, making cron job (e.g. daily) which invokes dog vdi
   snapshot simply enables it.
   
   But if a VDI doesn't have COWed objects, the snapshot will be
   redundant. So I want to add this option.
  
  Okay, your patch makes sense for periodic snapshot. But if dog have found
  identical snapshots, it won't create a new one and return success to the 
  caller.
  
  I assume the caller is some middleware, if there is no new vdi returned, 
  will
  this cause trouble for it? This means it will need to call 'vdi list' to 
  check
  if new vdi created or not? 
 
 So I'm adding this feature with the new option. Existing semantics
 isn't affected. And if checking process (has_own_objects()) faces
 error, it is reported correctly to middleware.
 
  
  I'm not agasint this patch, but I have some questions. For identical 
  snapshots,
  the overhead is just an inode object created, no? Looks to me the overhead 
  is
  quite small and no need a special option to remove it.
 
 Taking snapshots of thousands of VDIs will consume thousands of VID,
 and create thousands * replication factor of inodes. I'm not sure the
 consumption of VID will become serious problem, but inodes will make
 replication time longer (e.g. 16:4 ec requires 20 inodes).

Yes, this is the point. Make sense to me. I have some comments to the code in
my last email. Could you submit a V2? BTW, it would be great if you can include
above rationale into the commit log.

Thanks
Yuan
-- 
sheepdog mailing list
sheepdog@lists.wpkg.org
https://lists.wpkg.org/mailman/listinfo/sheepdog


Re: [sheepdog] [PATCH 1/2] dog: add a new option for reducing identical snapshots

2015-02-11 Thread Liu Yuan
On Mon, Feb 09, 2015 at 05:25:48PM +0900, Hitoshi Mitake wrote:
 Current dog vdi snapshot command creates a new snapshot
 unconditionally, even if a working VDI doesn't have its own
 objects. In such a case, the created snapshot is redundant because
 same VDI is already existing.

What kind of use case will create two identical snapshots? This logic is simple
and code is clean, but I doubt if there is real users of this option.

 
 This patch adds a new option -R to the dog command for reducing
 the identical snapshots.
 
 Signed-off-by: Hitoshi Mitake mitake.hito...@lab.ntt.co.jp
 ---
  dog/vdi.c | 48 +++-
  1 file changed, 47 insertions(+), 1 deletion(-)
 
 diff --git a/dog/vdi.c b/dog/vdi.c
 index 8e612af..ee465c2 100644
 --- a/dog/vdi.c
 +++ b/dog/vdi.c
 @@ -40,6 +40,8 @@ static struct sd_option vdi_options[] = {
  neither comparing nor repairing},
   {'z', block_size_shift, true, specify the bit shift num for
   data object size},
 + {'R', reduce-identical-snapshots, false, do not create snapshot if 
 +  working VDI doesn't have its own objects},
   { 0, NULL, false, NULL },
  };
  
 @@ -61,6 +63,7 @@ static struct vdi_cmd_data {
   uint64_t oid;
   bool no_share;
   bool exist;
 + bool reduce_identical_snapshots;
  } vdi_cmd_data = { ~0, };
  
  struct get_vdi_info {
 @@ -605,6 +608,31 @@ fail:
   return NULL;
  }
  
 +static bool has_own_objects(uint32_t vid, int *ret)

Traditionally, we'll have functions return SD_RES_xxx because in this way we
could propragate the ret to upper callers.

So it is better to have has_own_objects return SD_RES_xxx for consistency.

Thanks,
Yuan
-- 
sheepdog mailing list
sheepdog@lists.wpkg.org
https://lists.wpkg.org/mailman/listinfo/sheepdog


Re: [sheepdog] [PATCH 1/2] dog: add a new option for reducing identical snapshots

2015-02-11 Thread Hitoshi Mitake
At Thu, 12 Feb 2015 15:31:15 +0800,
Liu Yuan wrote:
 
 On Thu, Feb 12, 2015 at 03:59:51PM +0900, Hitoshi Mitake wrote:
  At Thu, 12 Feb 2015 14:38:37 +0800,
  Liu Yuan wrote:
   
   On Mon, Feb 09, 2015 at 05:25:48PM +0900, Hitoshi Mitake wrote:
Current dog vdi snapshot command creates a new snapshot
unconditionally, even if a working VDI doesn't have its own
objects. In such a case, the created snapshot is redundant because
same VDI is already existing.
   
   What kind of use case will create two identical snapshots? This logic is 
   simple
   and code is clean, but I doubt if there is real users of this option.
  
  Generally speaking, taking snapshot periodically is an ordinal usecase
  of enterprise SAN. Of course sheepdog can support this use case. In a
  case of sheepdog, making cron job (e.g. daily) which invokes dog vdi
  snapshot simply enables it.
  
  But if a VDI doesn't have COWed objects, the snapshot will be
  redundant. So I want to add this option.
 
 Okay, your patch makes sense for periodic snapshot. But if dog have found
 identical snapshots, it won't create a new one and return success to the 
 caller.
 
 I assume the caller is some middleware, if there is no new vdi returned, will
 this cause trouble for it? This means it will need to call 'vdi list' to check
 if new vdi created or not? 

So I'm adding this feature with the new option. Existing semantics
isn't affected. And if checking process (has_own_objects()) faces
error, it is reported correctly to middleware.

 
 I'm not agasint this patch, but I have some questions. For identical 
 snapshots,
 the overhead is just an inode object created, no? Looks to me the overhead is
 quite small and no need a special option to remove it.

Taking snapshots of thousands of VDIs will consume thousands of VID,
and create thousands * replication factor of inodes. I'm not sure the
consumption of VID will become serious problem, but inodes will make
replication time longer (e.g. 16:4 ec requires 20 inodes).

Thanks,
Hitoshi

 
 Thanks
 Yuan
 -- 
 sheepdog mailing list
 sheepdog@lists.wpkg.org
 https://lists.wpkg.org/mailman/listinfo/sheepdog
-- 
sheepdog mailing list
sheepdog@lists.wpkg.org
https://lists.wpkg.org/mailman/listinfo/sheepdog


Re: [sheepdog] [PATCH 1/2] dog: add a new option for reducing identical snapshots

2015-02-11 Thread Hitoshi Mitake
At Thu, 12 Feb 2015 14:38:37 +0800,
Liu Yuan wrote:
 
 On Mon, Feb 09, 2015 at 05:25:48PM +0900, Hitoshi Mitake wrote:
  Current dog vdi snapshot command creates a new snapshot
  unconditionally, even if a working VDI doesn't have its own
  objects. In such a case, the created snapshot is redundant because
  same VDI is already existing.
 
 What kind of use case will create two identical snapshots? This logic is 
 simple
 and code is clean, but I doubt if there is real users of this option.

Generally speaking, taking snapshot periodically is an ordinal usecase
of enterprise SAN. Of course sheepdog can support this use case. In a
case of sheepdog, making cron job (e.g. daily) which invokes dog vdi
snapshot simply enables it.

But if a VDI doesn't have COWed objects, the snapshot will be
redundant. So I want to add this option.

Of course, vdi list will provide information about the COWed object
(used field). But vdi list is heavy operation in a cluster which has
many VDIs because it issues bunch of read requests for inode
headers. So avoiding vdi listing as much as possible is fine.

Thanks,
Hitoshi

 
  
  This patch adds a new option -R to the dog command for reducing
  the identical snapshots.
  
  Signed-off-by: Hitoshi Mitake mitake.hito...@lab.ntt.co.jp
  ---
   dog/vdi.c | 48 +++-
   1 file changed, 47 insertions(+), 1 deletion(-)
  
  diff --git a/dog/vdi.c b/dog/vdi.c
  index 8e612af..ee465c2 100644
  --- a/dog/vdi.c
  +++ b/dog/vdi.c
  @@ -40,6 +40,8 @@ static struct sd_option vdi_options[] = {
 neither comparing nor repairing},
  {'z', block_size_shift, true, specify the bit shift num for
  data object size},
  +   {'R', reduce-identical-snapshots, false, do not create snapshot if 
  +working VDI doesn't have its own objects},
  { 0, NULL, false, NULL },
   };
   
  @@ -61,6 +63,7 @@ static struct vdi_cmd_data {
  uint64_t oid;
  bool no_share;
  bool exist;
  +   bool reduce_identical_snapshots;
   } vdi_cmd_data = { ~0, };
   
   struct get_vdi_info {
  @@ -605,6 +608,31 @@ fail:
  return NULL;
   }
   
  +static bool has_own_objects(uint32_t vid, int *ret)
 
 Traditionally, we'll have functions return SD_RES_xxx because in this way we
 could propragate the ret to upper callers.
 
 So it is better to have has_own_objects return SD_RES_xxx for consistency.
 
 Thanks,
 Yuan
 -- 
 sheepdog mailing list
 sheepdog@lists.wpkg.org
 https://lists.wpkg.org/mailman/listinfo/sheepdog
-- 
sheepdog mailing list
sheepdog@lists.wpkg.org
https://lists.wpkg.org/mailman/listinfo/sheepdog


Re: [sheepdog] [PATCH 1/2] dog: add a new option for reducing identical snapshots

2015-02-11 Thread Liu Yuan
On Thu, Feb 12, 2015 at 03:59:51PM +0900, Hitoshi Mitake wrote:
 At Thu, 12 Feb 2015 14:38:37 +0800,
 Liu Yuan wrote:
  
  On Mon, Feb 09, 2015 at 05:25:48PM +0900, Hitoshi Mitake wrote:
   Current dog vdi snapshot command creates a new snapshot
   unconditionally, even if a working VDI doesn't have its own
   objects. In such a case, the created snapshot is redundant because
   same VDI is already existing.
  
  What kind of use case will create two identical snapshots? This logic is 
  simple
  and code is clean, but I doubt if there is real users of this option.
 
 Generally speaking, taking snapshot periodically is an ordinal usecase
 of enterprise SAN. Of course sheepdog can support this use case. In a
 case of sheepdog, making cron job (e.g. daily) which invokes dog vdi
 snapshot simply enables it.
 
 But if a VDI doesn't have COWed objects, the snapshot will be
 redundant. So I want to add this option.

Okay, your patch makes sense for periodic snapshot. But if dog have found
identical snapshots, it won't create a new one and return success to the caller.

I assume the caller is some middleware, if there is no new vdi returned, will
this cause trouble for it? This means it will need to call 'vdi list' to check
if new vdi created or not? 

I'm not agasint this patch, but I have some questions. For identical snapshots,
the overhead is just an inode object created, no? Looks to me the overhead is
quite small and no need a special option to remove it.

Thanks
Yuan
-- 
sheepdog mailing list
sheepdog@lists.wpkg.org
https://lists.wpkg.org/mailman/listinfo/sheepdog


[sheepdog] [PATCH 1/2] dog: add a new option for reducing identical snapshots

2015-02-09 Thread Hitoshi Mitake
Current dog vdi snapshot command creates a new snapshot
unconditionally, even if a working VDI doesn't have its own
objects. In such a case, the created snapshot is redundant because
same VDI is already existing.

This patch adds a new option -R to the dog command for reducing
the identical snapshots.

Signed-off-by: Hitoshi Mitake mitake.hito...@lab.ntt.co.jp
---
 dog/vdi.c | 48 +++-
 1 file changed, 47 insertions(+), 1 deletion(-)

diff --git a/dog/vdi.c b/dog/vdi.c
index 8e612af..ee465c2 100644
--- a/dog/vdi.c
+++ b/dog/vdi.c
@@ -40,6 +40,8 @@ static struct sd_option vdi_options[] = {
   neither comparing nor repairing},
{'z', block_size_shift, true, specify the bit shift num for
data object size},
+   {'R', reduce-identical-snapshots, false, do not create snapshot if 
+working VDI doesn't have its own objects},
{ 0, NULL, false, NULL },
 };
 
@@ -61,6 +63,7 @@ static struct vdi_cmd_data {
uint64_t oid;
bool no_share;
bool exist;
+   bool reduce_identical_snapshots;
 } vdi_cmd_data = { ~0, };
 
 struct get_vdi_info {
@@ -605,6 +608,31 @@ fail:
return NULL;
 }
 
+static bool has_own_objects(uint32_t vid, int *ret)
+{
+   struct sd_inode *inode;
+   bool result = true;
+
+   inode = xzalloc(sizeof(*inode));
+
+   *ret = dog_read_object(vid_to_vdi_oid(vid), inode,
+ sizeof(*inode), 0, true);
+   if (*ret != SD_RES_SUCCESS)
+   goto out;
+
+   for (int i = 0; i  SD_INODE_DATA_INDEX; i++) {
+   if (inode-data_vdi_id[i]  inode-data_vdi_id[i] == vid)
+   /* VDI has its own object */
+   goto out;
+   }
+
+   result = false;
+
+out:
+   free(inode);
+   return result;
+}
+
 static int vdi_snapshot(int argc, char **argv)
 {
const char *vdiname = argv[optind++];
@@ -703,6 +731,22 @@ static int vdi_snapshot(int argc, char **argv)
nr_issued_prevent_inode_update++;
}
 
+   if (vdi_cmd_data.reduce_identical_snapshots) {
+   bool result = has_own_objects(vid, ret);
+
+   if (ret != SD_RES_SUCCESS)
+   goto out;
+
+   if (!result) {
+   if (verbose)
+   sd_info(VDI %s doesn't have its own objects, 
+   skipping creation of snapshot,
+   vdiname);
+
+   goto out;
+   }
+   }
+
ret = dog_write_object(vid_to_vdi_oid(vid), 0,
   vdi_cmd_data.snapshot_tag,
   SD_MAX_VDI_TAG_LEN,
@@ -3057,7 +3101,7 @@ static struct subcommand vdi_cmd[] = {
{create, vdiname size, PycaphrvzT, create an image,
 NULL, CMD_NEED_NODELIST|CMD_NEED_ARG,
 vdi_create, vdi_options},
-   {snapshot, vdiname, saphrvT, create a snapshot,
+   {snapshot, vdiname, saphrvTR, create a snapshot,
 NULL, CMD_NEED_ARG,
 vdi_snapshot, vdi_options},
{clone, src vdi dst vdi, sPnaphrvT, clone an image,
@@ -3222,6 +3266,8 @@ static int vdi_parser(int ch, const char *opt)
}
vdi_cmd_data.block_size_shift = block_size_shift;
break;
+   case 'R':
+   vdi_cmd_data.reduce_identical_snapshots = true;
}
 
return 0;
-- 
1.9.1

-- 
sheepdog mailing list
sheepdog@lists.wpkg.org
https://lists.wpkg.org/mailman/listinfo/sheepdog