Re: [sheepdog] [PATCH 1/2] dog: add a new option for reducing identical snapshots

2015-02-12 Thread Hitoshi Mitake
At Thu, 12 Feb 2015 15:49:00 +0800,
Liu Yuan wrote:
> 
> On Thu, Feb 12, 2015 at 04:40:56PM +0900, Hitoshi Mitake wrote:
> > At Thu, 12 Feb 2015 15:31:15 +0800,
> > Liu Yuan wrote:
> > > 
> > > On Thu, Feb 12, 2015 at 03:59:51PM +0900, Hitoshi Mitake wrote:
> > > > At Thu, 12 Feb 2015 14:38:37 +0800,
> > > > Liu Yuan wrote:
> > > > > 
> > > > > On Mon, Feb 09, 2015 at 05:25:48PM +0900, Hitoshi Mitake wrote:
> > > > > > Current "dog vdi snapshot" command creates a new snapshot
> > > > > > unconditionally, even if a working VDI doesn't have its own
> > > > > > objects. In such a case, the created snapshot is redundant because
> > > > > > same VDI is already existing.
> > > > > 
> > > > > What kind of use case will create two identical snapshots? This logic 
> > > > > is simple
> > > > > and code is clean, but I doubt if there is real users of this option.
> > > > 
> > > > Generally speaking, taking snapshot periodically is an ordinal usecase
> > > > of enterprise SAN. Of course sheepdog can support this use case. In a
> > > > case of sheepdog, making cron job (e.g. daily) which invokes "dog vdi
> > > > snapshot" simply enables it.
> > > > 
> > > > But if a VDI doesn't have COWed objects, the snapshot will be
> > > > redundant. So I want to add this option.
> > > 
> > > Okay, your patch makes sense for periodic snapshot. But if dog have found
> > > identical snapshots, it won't create a new one and return success to the 
> > > caller.
> > > 
> > > I assume the caller is some middleware, if there is no new vdi returned, 
> > > will
> > > this cause trouble for it? This means it will need to call 'vdi list' to 
> > > check
> > > if new vdi created or not? 
> > 
> > So I'm adding this feature with the new option. Existing semantics
> > isn't affected. And if checking process (has_own_objects()) faces
> > error, it is reported correctly to middleware.
> > 
> > > 
> > > I'm not agasint this patch, but I have some questions. For identical 
> > > snapshots,
> > > the overhead is just an inode object created, no? Looks to me the 
> > > overhead is
> > > quite small and no need a special option to remove it.
> > 
> > Taking snapshots of thousands of VDIs will consume thousands of VID,
> > and create thousands * replication factor of inodes. I'm not sure the
> > consumption of VID will become serious problem, but inodes will make
> > replication time longer (e.g. 16:4 ec requires 20 inodes).
> 
> Yes, this is the point. Make sense to me. I have some comments to the code in
> my last email. Could you submit a V2? BTW, it would be great if you can 
> include
> above rationale into the commit log.

OK, I'll send v2 later.

Thanks,
Hitoshi

> 
> Thanks
> Yuan
> -- 
> sheepdog mailing list
> sheepdog@lists.wpkg.org
> https://lists.wpkg.org/mailman/listinfo/sheepdog
-- 
sheepdog mailing list
sheepdog@lists.wpkg.org
https://lists.wpkg.org/mailman/listinfo/sheepdog


Re: [sheepdog] [PATCH 1/2] dog: add a new option for reducing identical snapshots

2015-02-11 Thread Liu Yuan
On Thu, Feb 12, 2015 at 04:40:56PM +0900, Hitoshi Mitake wrote:
> At Thu, 12 Feb 2015 15:31:15 +0800,
> Liu Yuan wrote:
> > 
> > On Thu, Feb 12, 2015 at 03:59:51PM +0900, Hitoshi Mitake wrote:
> > > At Thu, 12 Feb 2015 14:38:37 +0800,
> > > Liu Yuan wrote:
> > > > 
> > > > On Mon, Feb 09, 2015 at 05:25:48PM +0900, Hitoshi Mitake wrote:
> > > > > Current "dog vdi snapshot" command creates a new snapshot
> > > > > unconditionally, even if a working VDI doesn't have its own
> > > > > objects. In such a case, the created snapshot is redundant because
> > > > > same VDI is already existing.
> > > > 
> > > > What kind of use case will create two identical snapshots? This logic 
> > > > is simple
> > > > and code is clean, but I doubt if there is real users of this option.
> > > 
> > > Generally speaking, taking snapshot periodically is an ordinal usecase
> > > of enterprise SAN. Of course sheepdog can support this use case. In a
> > > case of sheepdog, making cron job (e.g. daily) which invokes "dog vdi
> > > snapshot" simply enables it.
> > > 
> > > But if a VDI doesn't have COWed objects, the snapshot will be
> > > redundant. So I want to add this option.
> > 
> > Okay, your patch makes sense for periodic snapshot. But if dog have found
> > identical snapshots, it won't create a new one and return success to the 
> > caller.
> > 
> > I assume the caller is some middleware, if there is no new vdi returned, 
> > will
> > this cause trouble for it? This means it will need to call 'vdi list' to 
> > check
> > if new vdi created or not? 
> 
> So I'm adding this feature with the new option. Existing semantics
> isn't affected. And if checking process (has_own_objects()) faces
> error, it is reported correctly to middleware.
> 
> > 
> > I'm not agasint this patch, but I have some questions. For identical 
> > snapshots,
> > the overhead is just an inode object created, no? Looks to me the overhead 
> > is
> > quite small and no need a special option to remove it.
> 
> Taking snapshots of thousands of VDIs will consume thousands of VID,
> and create thousands * replication factor of inodes. I'm not sure the
> consumption of VID will become serious problem, but inodes will make
> replication time longer (e.g. 16:4 ec requires 20 inodes).

Yes, this is the point. Make sense to me. I have some comments to the code in
my last email. Could you submit a V2? BTW, it would be great if you can include
above rationale into the commit log.

Thanks
Yuan
-- 
sheepdog mailing list
sheepdog@lists.wpkg.org
https://lists.wpkg.org/mailman/listinfo/sheepdog


Re: [sheepdog] [PATCH 1/2] dog: add a new option for reducing identical snapshots

2015-02-11 Thread Hitoshi Mitake
At Thu, 12 Feb 2015 15:31:15 +0800,
Liu Yuan wrote:
> 
> On Thu, Feb 12, 2015 at 03:59:51PM +0900, Hitoshi Mitake wrote:
> > At Thu, 12 Feb 2015 14:38:37 +0800,
> > Liu Yuan wrote:
> > > 
> > > On Mon, Feb 09, 2015 at 05:25:48PM +0900, Hitoshi Mitake wrote:
> > > > Current "dog vdi snapshot" command creates a new snapshot
> > > > unconditionally, even if a working VDI doesn't have its own
> > > > objects. In such a case, the created snapshot is redundant because
> > > > same VDI is already existing.
> > > 
> > > What kind of use case will create two identical snapshots? This logic is 
> > > simple
> > > and code is clean, but I doubt if there is real users of this option.
> > 
> > Generally speaking, taking snapshot periodically is an ordinal usecase
> > of enterprise SAN. Of course sheepdog can support this use case. In a
> > case of sheepdog, making cron job (e.g. daily) which invokes "dog vdi
> > snapshot" simply enables it.
> > 
> > But if a VDI doesn't have COWed objects, the snapshot will be
> > redundant. So I want to add this option.
> 
> Okay, your patch makes sense for periodic snapshot. But if dog have found
> identical snapshots, it won't create a new one and return success to the 
> caller.
> 
> I assume the caller is some middleware, if there is no new vdi returned, will
> this cause trouble for it? This means it will need to call 'vdi list' to check
> if new vdi created or not? 

So I'm adding this feature with the new option. Existing semantics
isn't affected. And if checking process (has_own_objects()) faces
error, it is reported correctly to middleware.

> 
> I'm not agasint this patch, but I have some questions. For identical 
> snapshots,
> the overhead is just an inode object created, no? Looks to me the overhead is
> quite small and no need a special option to remove it.

Taking snapshots of thousands of VDIs will consume thousands of VID,
and create thousands * replication factor of inodes. I'm not sure the
consumption of VID will become serious problem, but inodes will make
replication time longer (e.g. 16:4 ec requires 20 inodes).

Thanks,
Hitoshi

> 
> Thanks
> Yuan
> -- 
> sheepdog mailing list
> sheepdog@lists.wpkg.org
> https://lists.wpkg.org/mailman/listinfo/sheepdog
-- 
sheepdog mailing list
sheepdog@lists.wpkg.org
https://lists.wpkg.org/mailman/listinfo/sheepdog


Re: [sheepdog] [PATCH 1/2] dog: add a new option for reducing identical snapshots

2015-02-11 Thread Liu Yuan
On Thu, Feb 12, 2015 at 03:59:51PM +0900, Hitoshi Mitake wrote:
> At Thu, 12 Feb 2015 14:38:37 +0800,
> Liu Yuan wrote:
> > 
> > On Mon, Feb 09, 2015 at 05:25:48PM +0900, Hitoshi Mitake wrote:
> > > Current "dog vdi snapshot" command creates a new snapshot
> > > unconditionally, even if a working VDI doesn't have its own
> > > objects. In such a case, the created snapshot is redundant because
> > > same VDI is already existing.
> > 
> > What kind of use case will create two identical snapshots? This logic is 
> > simple
> > and code is clean, but I doubt if there is real users of this option.
> 
> Generally speaking, taking snapshot periodically is an ordinal usecase
> of enterprise SAN. Of course sheepdog can support this use case. In a
> case of sheepdog, making cron job (e.g. daily) which invokes "dog vdi
> snapshot" simply enables it.
> 
> But if a VDI doesn't have COWed objects, the snapshot will be
> redundant. So I want to add this option.

Okay, your patch makes sense for periodic snapshot. But if dog have found
identical snapshots, it won't create a new one and return success to the caller.

I assume the caller is some middleware, if there is no new vdi returned, will
this cause trouble for it? This means it will need to call 'vdi list' to check
if new vdi created or not? 

I'm not agasint this patch, but I have some questions. For identical snapshots,
the overhead is just an inode object created, no? Looks to me the overhead is
quite small and no need a special option to remove it.

Thanks
Yuan
-- 
sheepdog mailing list
sheepdog@lists.wpkg.org
https://lists.wpkg.org/mailman/listinfo/sheepdog


Re: [sheepdog] [PATCH 1/2] dog: add a new option for reducing identical snapshots

2015-02-11 Thread Hitoshi Mitake
At Thu, 12 Feb 2015 14:38:37 +0800,
Liu Yuan wrote:
> 
> On Mon, Feb 09, 2015 at 05:25:48PM +0900, Hitoshi Mitake wrote:
> > Current "dog vdi snapshot" command creates a new snapshot
> > unconditionally, even if a working VDI doesn't have its own
> > objects. In such a case, the created snapshot is redundant because
> > same VDI is already existing.
> 
> What kind of use case will create two identical snapshots? This logic is 
> simple
> and code is clean, but I doubt if there is real users of this option.

Generally speaking, taking snapshot periodically is an ordinal usecase
of enterprise SAN. Of course sheepdog can support this use case. In a
case of sheepdog, making cron job (e.g. daily) which invokes "dog vdi
snapshot" simply enables it.

But if a VDI doesn't have COWed objects, the snapshot will be
redundant. So I want to add this option.

Of course, vdi list will provide information about the COWed object
(used field). But vdi list is heavy operation in a cluster which has
many VDIs because it issues bunch of read requests for inode
headers. So avoiding vdi listing as much as possible is fine.

Thanks,
Hitoshi

> 
> > 
> > This patch adds a new option -R to the dog command for reducing
> > the identical snapshots.
> > 
> > Signed-off-by: Hitoshi Mitake 
> > ---
> >  dog/vdi.c | 48 +++-
> >  1 file changed, 47 insertions(+), 1 deletion(-)
> > 
> > diff --git a/dog/vdi.c b/dog/vdi.c
> > index 8e612af..ee465c2 100644
> > --- a/dog/vdi.c
> > +++ b/dog/vdi.c
> > @@ -40,6 +40,8 @@ static struct sd_option vdi_options[] = {
> >  "  neither comparing nor repairing"},
> > {'z', "block_size_shift", true, "specify the bit shift num for"
> >" data object size"},
> > +   {'R', "reduce-identical-snapshots", false, "do not create snapshot if "
> > +"working VDI doesn't have its own objects"},
> > { 0, NULL, false, NULL },
> >  };
> >  
> > @@ -61,6 +63,7 @@ static struct vdi_cmd_data {
> > uint64_t oid;
> > bool no_share;
> > bool exist;
> > +   bool reduce_identical_snapshots;
> >  } vdi_cmd_data = { ~0, };
> >  
> >  struct get_vdi_info {
> > @@ -605,6 +608,31 @@ fail:
> > return NULL;
> >  }
> >  
> > +static bool has_own_objects(uint32_t vid, int *ret)
> 
> Traditionally, we'll have functions return SD_RES_xxx because in this way we
> could propragate the ret to upper callers.
> 
> So it is better to have has_own_objects return SD_RES_xxx for consistency.
> 
> Thanks,
> Yuan
> -- 
> sheepdog mailing list
> sheepdog@lists.wpkg.org
> https://lists.wpkg.org/mailman/listinfo/sheepdog
-- 
sheepdog mailing list
sheepdog@lists.wpkg.org
https://lists.wpkg.org/mailman/listinfo/sheepdog


Re: [sheepdog] [PATCH 1/2] dog: add a new option for reducing identical snapshots

2015-02-11 Thread Liu Yuan
On Mon, Feb 09, 2015 at 05:25:48PM +0900, Hitoshi Mitake wrote:
> Current "dog vdi snapshot" command creates a new snapshot
> unconditionally, even if a working VDI doesn't have its own
> objects. In such a case, the created snapshot is redundant because
> same VDI is already existing.

What kind of use case will create two identical snapshots? This logic is simple
and code is clean, but I doubt if there is real users of this option.

> 
> This patch adds a new option -R to the dog command for reducing
> the identical snapshots.
> 
> Signed-off-by: Hitoshi Mitake 
> ---
>  dog/vdi.c | 48 +++-
>  1 file changed, 47 insertions(+), 1 deletion(-)
> 
> diff --git a/dog/vdi.c b/dog/vdi.c
> index 8e612af..ee465c2 100644
> --- a/dog/vdi.c
> +++ b/dog/vdi.c
> @@ -40,6 +40,8 @@ static struct sd_option vdi_options[] = {
>"  neither comparing nor repairing"},
>   {'z', "block_size_shift", true, "specify the bit shift num for"
>  " data object size"},
> + {'R', "reduce-identical-snapshots", false, "do not create snapshot if "
> +  "working VDI doesn't have its own objects"},
>   { 0, NULL, false, NULL },
>  };
>  
> @@ -61,6 +63,7 @@ static struct vdi_cmd_data {
>   uint64_t oid;
>   bool no_share;
>   bool exist;
> + bool reduce_identical_snapshots;
>  } vdi_cmd_data = { ~0, };
>  
>  struct get_vdi_info {
> @@ -605,6 +608,31 @@ fail:
>   return NULL;
>  }
>  
> +static bool has_own_objects(uint32_t vid, int *ret)

Traditionally, we'll have functions return SD_RES_xxx because in this way we
could propragate the ret to upper callers.

So it is better to have has_own_objects return SD_RES_xxx for consistency.

Thanks,
Yuan
-- 
sheepdog mailing list
sheepdog@lists.wpkg.org
https://lists.wpkg.org/mailman/listinfo/sheepdog