[ 
https://issues.apache.org/jira/browse/GOBBLIN-707?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jay Sen updated GOBBLIN-707:
----------------------------
    Description: 
Gobblin supports multiple execution modes (CLI, standalone, cluster-master, 
cluster-worker, AWS, YARN, MR) and provides various command-line utilities to 
run CLI and admin commands. The problem is that each CLI utility and execution 
mode has its own script to manage the service, which causes the following 
problems.

Having individual scripts introduces a lot of issues:
 # The scripts handle Gobblin variables and user parameters differently and 
are highly inconsistent with one another, not to mention that different 
scripts support different features.
 # Functionality around start, stop, status checking, and PID handling, among 
many other things, varies widely depending on each script's implementation.
 # Features such as GC and JVM params, log4j file selection, and classpath 
calculation exist in some Gobblin scripts but not all, adding to an 
inconsistent user experience.
 # Code duplication: all the Gobblin scripts share a lot of common code to 
handle params, start and stop services, status checks, PID handling, etc. 
Combining all the scripts into one not only makes maintenance easier but also 
brings clarity and consistency.
 # Basically, the current 13 different scripts confuse new users about how to 
use Gobblin.


Solution:

1. There can be one gobblin.sh script that handles all Gobblin commands and 
deployment options, with the following signature. NOTE: This

{{gobblin.sh  <command> <params>}}
 {{gobblin.sh  <execution-mode> <start|stop|status>}}

{{command values: admin, cli, statestore-check, statestore-clean, 
historystore-manager, classpath}}
 {{service values: standalone, cluster-master, cluster-worker, aws, yarn, mr, 
service}}
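
The start|stop|status actions in the signature are where the existing scripts 
differ most in PID handling. As a minimal sketch of what one shared helper 
could look like (the function name {{service_status}} and the PID file 
argument are hypothetical, not taken from any existing Gobblin script):

```shell
#!/usr/bin/env bash
# Hypothetical shared status check, assuming each service writes its PID
# to a file whose path is passed in as the first argument.

service_status() {
  local pid_file="$1" pid
  if [ ! -f "$pid_file" ]; then
    # No PID file: the service was never started or was stopped cleanly.
    echo "stopped"
    return 3
  fi
  pid="$(cat "$pid_file")"
  # kill -0 sends no signal; it only checks that the process can be signalled.
  if [ -n "$pid" ] && kill -0 "$pid" 2>/dev/null; then
    echo "running (pid $pid)"
  else
    # A PID file exists but no matching live process was found.
    echo "stale pid file"
    return 1
  fi
}
```

Every execution mode could call the same function with its own PID file, so 
status reporting would no longer depend on which script is used.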

With the above change, the following become valid commands:
{code:java}
# all under GobblinCli class
gobblin run listQuickApps    -> gobblin cli run listQuickApps <params>
gobblin run <quick-app-name> -> gobblin cli run <quick-app-name> <params>

# class: JobStateToJsonConverter
statestore-checker.sh <args> -> gobblin cli job-state-to-json <params>

# class: StateStoreCleaner
statestore-clean.sh <args> -> the class is deprecated, so there is no need to migrate it

# class: DatabaseJobHistoryStoreSchemaManager
historystore-manager.sh <args> -> gobblin cli job-store-schema-manager <params>

# class: Cli
gobblin-admin.sh <args> -> gobblin cli admin <args>

# all gobblin deployment modes
gobblin-cluster-master.sh -> gobblin service cluster-master start|stop|status
gobblin-cluster-worker.sh -> gobblin service cluster-worker start|stop|status
gobblin-compaction.sh     -> gobblin-compaction.sh (kept as is for now; can be migrated to the new script framework)
gobblin-mapreduce.sh      -> gobblin service mapreduce start|stop|status
gobblin-service.sh        -> gobblin service service-manager start|stop|status
gobblin-standalone.sh     -> gobblin service standalone start|stop|status
gobblin-yarn.sh           -> gobblin service yarn start|stop|status
{code}
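
The mapping above could be served by a single dispatcher. The following is 
only a sketch under assumptions: the function name {{gobblin_dispatch}} is 
hypothetical, the accepted modes are copied from the list above, and the 
function echoes the action it would take instead of launching any Java class.

```shell
#!/usr/bin/env bash
# Hypothetical single-entry dispatcher; not the actual gobblin.sh.
# It validates the arguments and prints the action it would perform.

gobblin_dispatch() {
  local group="${1:-}"
  case "$group" in
    cli)
      shift
      if [ "$#" -lt 1 ]; then
        echo "usage: gobblin cli <command> <params>" >&2
        return 1
      fi
      local cmd="$1"
      shift
      # Everything after the command name is passed through as params.
      echo "cli command=$cmd params=$*"
      ;;
    service)
      local mode="${2:-}" action="${3:-}"
      case "$mode" in
        standalone|cluster-master|cluster-worker|mapreduce|service-manager|yarn) ;;
        *) echo "unknown execution mode: '$mode'" >&2; return 1 ;;
      esac
      case "$action" in
        start|stop|status) echo "service mode=$mode action=$action" ;;
        *) echo "usage: gobblin service $mode start|stop|status" >&2; return 1 ;;
      esac
      ;;
    *)
      echo "usage: gobblin cli <command> <params>" >&2
      echo "       gobblin service <execution-mode> start|stop|status" >&2
      return 1
      ;;
  esac
}
```

For example, {{gobblin_dispatch service yarn status}} is accepted, while a 
misspelled mode or an unsupported action is rejected in one place rather than 
in each script.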
 

2. All configurations for each mode also need to be structured and 
de-duplicated so that it is clear which config is picked up for which 
execution mode. This would be documented in the command help instructions.
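
One way to make the config choice explicit is to resolve it from the execution 
mode alone. The sketch below assumes a hypothetical layout in which each mode 
has its own directory under {{conf/}}; neither the layout nor the function 
name {{resolve_conf_dir}} comes from the existing scripts.

```shell
#!/usr/bin/env bash
# Hypothetical layout (an assumption): <gobblin-home>/conf/<execution-mode>/
# holds the configuration for exactly one execution mode.

resolve_conf_dir() {
  local home="$1" mode="$2"
  local mode_dir="$home/conf/$mode"
  if [ -d "$mode_dir" ]; then
    echo "$mode_dir"
  else
    # Fail loudly instead of silently falling back to another mode's config.
    echo "no config directory for execution mode '$mode' under $home/conf" >&2
    return 1
  fi
}
```

With a rule like this, the help text only has to state the directory 
convention once for every mode.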

 {color:#ff0000}
 NOTE: This refactoring adds all CLI and service commands to gobblin.sh and 
hence changes the syntax for all commands and services.{color}

  was:
gobblin supports multiple modes of executions ( CLI, Standalone, 
cluster-master, cluster-worker, AWS, YARN, MR ) and various command lines 
utility to run cli and admin commands. The problem is each cli and execution 
mode has individual script to manage the service, which brings following 
problems.

Having individual script introduces lot of issues
 # all scripts handles gobblin variables, user parameters differently, and its 
highly inconsistent among various different gobblin scripts, not to mention 
different features supported by different scripts.
 # functionality around start, stop, status checking and handling PID's among 
lot of other things, varies vastly as per the implementation of the script.
 # features like GC & JVM params, log4j file selection, classpath calculation, 
etc... exists in some gobblin scripts but not all, adding to inconsistent user 
experience.
# code duplication: all the gobblin scripts share lot of common code to handle 
params, start, stop services, status checks, pid handling, etc... combining all 
the scripts into 1 not only makes maintenance easier but also brings clarity 
and consistency. 
# Basically, current 13 different scripts adds confusion to new user on how to 
use Gobblin or how to use it.


Solution:

1. there can be one gobblin.sh script to handle all gobblin commands and 
deployment options as per following signature. NOTE: This

{{gobblin.sh  <command> <params>}}
 {{gobblin.sh  <execution-mode> <start|stop|status>}}

{{commands values: admin, cli, statestore-check, statestore-clean, 
historystore-manager, classpath}}
 {{service values: standalone, cluster-master, cluster-worker, aws, yarn, mr, 
service}}

with above change, following becomes valid command.
{code:java}
# all under GobblinCli class
gobblin run listQuickApps  –> gobblin cli run listQuickApps
gobblin run listQuickApps  –> gobblin cli run listQuickApps
gobblin run <quick-app-name> -> gobblin cli run <quick-app-name>

# class: JobStateToJsonConverter
statestore-checker.sh <args> -> gobblin statestore-checker <args>

# class: StateStoreCleaner
statestore-clean.sh <args> -> gobblin statestore-clean <args>

# class: DatabaseJobHistoryStoreSchemaManager
historystore-manager.sh <args> -> gobblin historystore-manager <args>

# class: Cli
gobblin-admin.sh <args>   -> gobblin admin <args>

# all gobblin deployment modes
gobblin-cluster-master.sh   -> gobblin cluster-mater start|stop|status
gobblin-cluster-worker.sh   -> gobblin cluster-mater start|stop|status
gobblin-compaction.sh       -> gobblin cluster-mater start|stop|status
gobblin-env.sh              -> gobblin cluster-mater start|stop|status
gobblin-mapreduce.sh        -> gobblin cluster-mater start|stop|status
gobblin-service.sh          -> gobblin cluster-mater start|stop|status
gobblin-standalone.sh       -> gobblin cluster-mater start|stop|status
gobblin-yarn.sh             -> gobblin cluster-mater start|stop|status
{code}
 

2. Also configs needs to be structured and deduped accordingly to make it clear 
on which config will be picked up for which execution mode.

 {color:#ff0000}
 NOTE: this refactoring adds all cli and service commands to gobblin.sh and 
hence changes the syntax for all commands and services.{color}


> combine & standardize all gobblin scripts into one master script & 
> restructure configs accordingly
> --------------------------------------------------------------------------------------------------
>
>                 Key: GOBBLIN-707
>                 URL: https://issues.apache.org/jira/browse/GOBBLIN-707
>             Project: Apache Gobblin
>          Issue Type: Improvement
>            Reporter: Jay Sen
>            Priority: Major
>          Time Spent: 9h 50m
>  Remaining Estimate: 0h



--
This message was sent by Atlassian Jira
(v8.3.2#803003)
