[ 
https://issues.apache.org/jira/browse/HBASE-21071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16584741#comment-16584741
 ] 

Mingliang Liu edited comment on HBASE-21071 at 8/19/18 6:53 AM:
----------------------------------------------------------------

{quote}
It seems the Builder can be a class within MiniClusterOptions.
{quote}
Thanks [~yuzhih...@gmail.com]. Yes that's a good suggestion.

{quote}
Should it be StartMiniClusterOptions to tie the new Builder tighter to the 
startMiniCluster method? (Will other methods in MiniCluster want to take 
options?)
{quote}
Yes this should be {{StartMiniClusterOptions}}. I don't find other methods that 
will be using this option class.

{quote}
Does this mean the options should be MiniHBaseClusterOptions and we should 
rename this start method to be startMiniHBaseCluster. Would a 
MiniHBaseClusterBuilder make sense returning a MiniHBaseCluster instance on 
which you called start.
{quote}
I checked this and tried in code for something that only focuses on the 
{{MiniHBaseCluster}}. Also followed the idea of creating a 
{{MiniHBaseClusterBuilder}} class to build a {{MiniHBaseCluster}} directly. 
However, I got two obstacles after a short code walk:
# Option combinations supported by {{startMiniCluster}} also include multiple 
HDFS options. To simplify those polymorphic helper methods, it seems easier to 
consolidate the hbase/hdfs/zk options in one place {{StartMiniClusterOptions}}. 
Specially {{startMiniHBaseCluster}} also accepts this option.
# If we replace {{startMiniCluster()}}, the current build-and-start combo 
methods, with creating {{MiniHBaseCluster}} from builder first and calling 
start(), all call sites (hundreds) will have to update. Meanwhile, 
{{MiniHBaseCluster}} constructor currently initializes the cluster. We will 
have to split it to builder phase and start() phase. Some of the methods are 
using non-static {{HBaseTestingUtility}} methods to prepare directories. As 
[~stack] expects, it will be a very large refactoring patch.

*TL;DR* I take the trade off of being perfect and being affordable change. To 
start a {{MiniCluster}} or {{MiniHBaseCluster}} cluster, we first build an 
immutable option and then pass it to {{startMiniCluster(option)}} or 
{{startMiniHBaseCluster(option)}} methods. If using default option values, we 
can avoid build an option and use the other two methods {{startMiniCluster()}} 
or {{startMiniHBaseCluster()}}. These four methods sever all use cases we are 
targeting in a (hopefully) clear, simple and flexible way.

The v1 patch almost implements this idea, with one exception: 
{{startMiniCluster(int numSlaves)}} and {{startMiniCluster(int numMasters, int 
numSlaves)}} are not yet removed and replaced with simple builder calls. The 
reason is that, there are hundreds of calls of those two methods, and changing 
them manually can be error-prone. I'd like to post the v1 patch first for high 
level review and Jenkins. If it looks good overall, in the next patch I can 
update all other places. I'm fine if we keep one of them as another shortcut by 
the way.

Thanks,


was (Author: liuml07):
{quote}
It seems the Builder can be a class within MiniClusterOptions.
{quote}
Thanks [~yuzhih...@gmail.com]. Yes that's a good suggestion.

{quote}
Should it be StartMiniClusterOptions to tie the new Builder tighter to the 
startMiniCluster method? (Will other methods in MiniCluster want to take 
options?)
{quote}
Yes this should be {{StartMiniClusterOptions}}. I don't find other methods that 
will be using this option class.

{quote}
Does this mean the options should be MiniHBaseClusterOptions and we should 
rename this start method to be startMiniHBaseCluster. Would a 
MiniHBaseClusterBuilder make sense returning a MiniHBaseCluster instance on 
which you called start.
{quote}
I checked this and tried in code for something that only focuses on the 
{{MiniHBaseCluster}}. Also followed the idea of creating a 
{{MiniHBaseClusterBuilder}} class to build a {{MiniHBaseCluster}} directly. 
However, I got two obstacles after a short code walk:
# Option combinations supported by {{startMiniCluster}} also include multiple 
HDFS options. To simplify those polymorphic helper methods, it seems easier to 
consolidate the hbase/hdfs/zk options in one place {{StartMiniClusterOptions}}. 
Specially {{startMiniHBaseCluster}} also accepts this option.
# If we replace {{startMiniCluster()}, the current build-and-start combo 
methods, with creating {{MiniHBaseCluster}} from builder first and calling 
start(), all call sites (hundreds) will have to update. Meanwhile, 
{{MiniHBaseCluster}} constructor currently initializes the cluster. We will 
have to split it to builder phase and start() phase. Some of the methods are 
using non-static {{HBaseTestingUtility}} methods to prepare directories. As 
[~stack] expects, it will be a very large refactoring patch.

*TL;DR* I take the trade off of being perfect and being affordable change. To 
start a {{MiniCluster}} or {{MiniHBaseCluster}} cluster, we first build an 
immutable option and then pass it to {{startMiniCluster(option)}} or 
{{startMiniHBaseCluster(option)}} methods. If using default option values, we 
can avoid build an option and use the other two methods {{startMiniCluster()}} 
or {{startMiniHBaseCluster()}}. These four methods sever all use cases we are 
targeting in a (hopefully) clear, simple and flexible way.

The v1 patch almost implements this idea, with one exception: 
{{startMiniCluster(int numSlaves)}} and {{startMiniCluster(int numMasters, int 
numSlaves)}} are not yet removed and replaced with simple builder calls. The 
reason is that, there are hundreds of calls of those two methods, and changing 
them manually can be error-prone. I'd like to post the v1 patch first for high 
level review and Jenkins. If it looks good overall, in the next patch I can 
update all other places. I'm fine if we keep one of them as another shortcut by 
the way.

Thanks,

> HBaseTestingUtility::startMiniCluster() to use builder pattern
> --------------------------------------------------------------
>
>                 Key: HBASE-21071
>                 URL: https://issues.apache.org/jira/browse/HBASE-21071
>             Project: HBase
>          Issue Type: Bug
>          Components: test
>    Affects Versions: 3.0.0
>            Reporter: Mingliang Liu
>            Assignee: Mingliang Liu
>            Priority: Major
>         Attachments: HBASE-21071.000.patch, HBASE-21071.001.patch
>
>
> Currently there are 13 {{startMiniCluster()}} methods to set up a mini 
> cluster. I'm not surprised if we have a few more in future. It's good to 
> support different combination of optional parameters. We have to pick up one 
> of them carefully while still wondering the default values of other 
> parameters; if we add a new option, we may bring more new methods.
> One solution is to use builder pattern: create a class {{MiniClusterOptions}} 
> along with a static class {{MiniClusterOptionsBuilder}}, create a new method  
> {{startMiniCluster(MiniClusterOptions)}}. In {{master}} we delete the old 13 
> methods while in branch-2, we deprecate the old 13 methods.
> Thoughts?



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

Reply via email to