[ 
https://issues.apache.org/jira/browse/GOBBLIN-707?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jay Sen updated GOBBLIN-707:
----------------------------
    Description: 
Gobblin supports multiple execution modes (CLI, Standalone, cluster-master, 
cluster-worker, AWS, YARN, MR) and various command-line utilities to run CLI 
and admin commands. There is an individual script for each of them.

Having individual scripts introduces a lot of issues:
 # Each script handles Gobblin variables and user parameters differently, so 
the scripts are highly inconsistent with one another.
 # Functionality around start, stop, status checks, and PID handling, among 
many other things, varies widely depending on how each script is implemented.
 # Features such as GC and JVM params, log4j file selection, and classpath 
calculation exist in some Gobblin scripts but not all, adding to an 
inconsistent user experience.
 # Maintaining a total of 13 scripts would be too much effort.

All the Gobblin scripts also share a lot of common code to handle params, 
start and stop services, status checks, PID handling, etc. Combining all the 
scripts into one not only makes maintenance easier but also brings clarity and 
consistency.
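
As a rough illustration of the kind of common code that could be shared, here 
is a minimal sketch of PID-file based start/stop/status helpers. The function 
names, the PID_DIR variable, and the PID file naming are assumptions made for 
this example; they are not taken from the existing Gobblin scripts.

```shell
#!/usr/bin/env bash
# Hypothetical shared helpers; names and PID-file layout are assumptions.

PID_DIR="${PID_DIR:-${TMPDIR:-/tmp}}"   # assumed directory for PID files

# Print the PID file path for a given service name.
pid_file() { echo "$PID_DIR/gobblin-$1.pid"; }

# Succeed if the PID file exists and the recorded process is alive.
is_running() {
  local file
  file="$(pid_file "$1")"
  [ -f "$file" ] && kill -0 "$(cat "$file")" 2>/dev/null
}

# Start a service in the background: start_service <name> <cmd> [args...]
start_service() {
  local name="$1"; shift
  if is_running "$name"; then
    echo "$name is already running"
    return 1
  fi
  "$@" >/dev/null 2>&1 &
  echo "$!" > "$(pid_file "$name")"
  echo "$name started"
}

# Stop a service and remove its PID file (also cleans up a stale file).
stop_service() {
  local name="$1" file
  file="$(pid_file "$name")"
  if is_running "$name"; then
    kill "$(cat "$file")"
    rm -f "$file"
    echo "$name stopped"
  else
    rm -f "$file"
    echo "$name is not running"
  fi
}

# Report whether a service is running.
status_service() {
  if is_running "$1"; then
    echo "$1 is running"
  else
    echo "$1 is not running"
  fi
}
```

With helpers like these in one place, every execution mode would get the same 
start, stop, and status behaviour, e.g. {{start_service demo sleep 30}} 
followed by {{status_service demo}} and {{stop_service demo}}.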

 

Solution:

1. There can be one gobblin.sh script to handle all Gobblin commands and 
deployment options, with the following signature. NOTE: This

{{gobblin.sh  <command> <params>}}
 {{gobblin.sh  <execution-mode> <start|stop|status>}}

{{command values: admin, cli, statestore-check, statestore-clean, 
historystore-manager, classpath}}
 {{execution-mode values: standalone, cluster-master, cluster-worker, aws, 
yarn, mr, service}}

With the above change, the following become valid commands:
{code:java}
# all under GobblinCli class
gobblin run listQuickApps    -> gobblin cli run listQuickApps
gobblin run <quick-app-name> -> gobblin cli run <quick-app-name>

# class: JobStateToJsonConverter
statestore-checker.sh <args> -> gobblin statestore-checker <args>

# class: StateStoreCleaner
statestore-clean.sh <args> -> gobblin statestore-clean <args>

# class: DatabaseJobHistoryStoreSchemaManager
historystore-manager.sh <args> -> gobblin historystore-manager <args>

# class: Cli
gobblin-admin.sh <args>   -> gobblin admin <args>

# all gobblin deployment modes
gobblin-cluster-master.sh   -> gobblin cluster-master start|stop|status
gobblin-cluster-worker.sh   -> gobblin cluster-worker start|stop|status
gobblin-compaction.sh       -> gobblin cluster-master start|stop|status
gobblin-env.sh              -> gobblin cluster-master start|stop|status
gobblin-mapreduce.sh        -> gobblin mr start|stop|status
gobblin-service.sh          -> gobblin service start|stop|status
gobblin-standalone.sh       -> gobblin standalone start|stop|status
gobblin-yarn.sh             -> gobblin yarn start|stop|status
{code}
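
To make the proposed signature more concrete, here is a minimal sketch of how 
a single entry point could tell commands apart from execution modes. The 
command and execution-mode names come from the lists above; the function names 
{{dispatch}} and {{contains}} are hypothetical, and the sketch only prints what 
it would run instead of launching anything.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of a single gobblin.sh entry point; not the actual script.

# Names taken from the issue description.
COMMANDS="admin cli statestore-check statestore-clean historystore-manager classpath"
SERVICES="standalone cluster-master cluster-worker aws yarn mr service"

# Return 0 if the first argument equals one of the remaining arguments.
contains() {
  local needle="$1"; shift
  local item
  for item in "$@"; do
    [ "$item" = "$needle" ] && return 0
  done
  return 1
}

# Classify the invocation and print what would be run.
# A real script would build the classpath and launch the JVM here.
dispatch() {
  local target="${1:-}"
  if [ -z "$target" ]; then
    echo "usage: gobblin.sh <command> <params> | <execution-mode> <start|stop|status>" >&2
    return 2
  fi
  shift
  # $COMMANDS and $SERVICES are left unquoted on purpose so they split into words.
  if contains "$target" $COMMANDS; then
    echo "command=$target params=$*"
  elif contains "$target" $SERVICES; then
    local action="${1:-}"
    case "$action" in
      start|stop|status) echo "service=$target action=$action" ;;
      *) echo "error: expected start|stop|status, got '$action'" >&2; return 2 ;;
    esac
  else
    echo "error: unknown command or execution mode '$target'" >&2
    return 2
  fi
}

# Dispatch only when arguments were passed on the command line.
if [ "$#" -gt 0 ]; then
  dispatch "$@"
fi
```

For example, {{dispatch cli run listQuickApps}} prints 
{{command=cli params=run listQuickApps}}, and {{dispatch yarn start}} prints 
{{service=yarn action=start}}. Keeping both lists in one place means an unknown 
name or a missing start|stop|status action is rejected the same way for every 
mode.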
 

2. The configs also need to be structured and deduplicated accordingly.

 
{color:red}
    NOTE: this refactoring into gobblin.sh changes the way all Gobblin 
commands were run before.
{color}

  was:
Gobblin supports multiple execution modes (CLI, Standalone, cluster-master, 
cluster-worker, AWS, YARN, MR), and there is an individual script for each of 
them.

Having individual scripts introduces a lot of issues:
 # Each script handles Gobblin variables and user parameters differently, so 
the scripts are highly inconsistent with one another.
 # Functionality around start, stop, status checks, and PID handling, among 
many other things, varies widely depending on how each script is implemented.
 # Features such as GC and JVM params, log4j file selection, and classpath 
calculation exist in some Gobblin scripts but not all, adding to an 
inconsistent user experience.
 # Maintaining a total of 13 scripts would be too much effort.

All the Gobblin scripts also share a lot of common code to handle params, 
start and stop services, status checks, PID handling, etc. Combining all the 
scripts into one not only makes maintenance easier but also brings clarity and 
consistency.

 

Solution:

1. There can be one gobblin.sh script to handle all Gobblin commands and 
deployment options, with the following signature. NOTE: This

{{gobblin.sh  <command> <params>}}
 {{gobblin.sh  <execution-mode> <start|stop|status>}}

{{command values: admin, cli, statestore-check, statestore-clean, 
historystore-manager, classpath}}
 {{execution-mode values: standalone, cluster-master, cluster-worker, aws, 
yarn, mr, service}}

With the above change, the following become valid commands:
{code:java}
# all under GobblinCli class
gobblin run listQuickApps    -> gobblin cli run listQuickApps
gobblin run <quick-app-name> -> gobblin cli run <quick-app-name>

# class: JobStateToJsonConverter
statestore-checker.sh <args> -> gobblin statestore-checker <args>

# class: StateStoreCleaner
statestore-clean.sh <args> -> gobblin statestore-clean <args>

# class: DatabaseJobHistoryStoreSchemaManager
historystore-manager.sh <args> -> gobblin historystore-manager <args>

# class: Cli
gobblin-admin.sh <args>   -> gobblin admin <args>

# all gobblin deployment modes
gobblin-cluster-master.sh   -> gobblin cluster-master start|stop|status
gobblin-cluster-worker.sh   -> gobblin cluster-worker start|stop|status
gobblin-compaction.sh       -> gobblin cluster-master start|stop|status
gobblin-env.sh              -> gobblin cluster-master start|stop|status
gobblin-mapreduce.sh        -> gobblin mr start|stop|status
gobblin-service.sh          -> gobblin service start|stop|status
gobblin-standalone.sh       -> gobblin standalone start|stop|status
gobblin-yarn.sh             -> gobblin yarn start|stop|status
{code}
 

2. The configs also need to be structured and deduplicated accordingly.

 
{color:red}
    NOTE: this refactoring into gobblin.sh changes the way all Gobblin 
commands were run before.
{color}


> combine & standardize all gobblin scripts into one master script & 
> restructure configs accordingly
> --------------------------------------------------------------------------------------------------
>
>                 Key: GOBBLIN-707
>                 URL: https://issues.apache.org/jira/browse/GOBBLIN-707
>             Project: Apache Gobblin
>          Issue Type: Improvement
>            Reporter: Jay Sen
>            Priority: Major
>          Time Spent: 5h
>  Remaining Estimate: 0h



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
