[jira] [Commented] (GOBBLIN-707) combine & standardize all gobblin scripts into one master script & restructure configs accordingly
[ https://issues.apache.org/jira/browse/GOBBLIN-707?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16878203#comment-16878203 ]

Jay Sen commented on GOBBLIN-707:
---------------------------------

[~ibuenros], pls take a look.

> combine & standardize all gobblin scripts into one master script & restructure configs accordingly
> ---------------------------------------------------------------------------------------------------
>
>                 Key: GOBBLIN-707
>                 URL: https://issues.apache.org/jira/browse/GOBBLIN-707
>             Project: Apache Gobblin
>          Issue Type: Improvement
>            Reporter: Jay Sen
>            Priority: Major
>          Time Spent: 9h
>  Remaining Estimate: 0h
>
> Gobblin supports multiple modes of execution (CLI, Standalone, cluster-master, cluster-worker, AWS, YARN, MR) and various command-line utilities to run cli and admin commands. There is an individual script for each of them.
> Having individual scripts introduces a lot of issues:
> # all scripts handle gobblin variables and user parameters differently, and the handling is highly inconsistent among the various gobblin scripts.
> # functionality around start, stop, status checking and handling PIDs, among a lot of other things, varies vastly with the implementation of each script.
> # features like GC & JVM params, log4j file selection, classpath calculation, etc. exist in some gobblin scripts but not all, adding to an inconsistent user experience.
> # maintaining a total of 13 scripts would be too much effort.
> Also, all the gobblin scripts share a lot of common code to handle params, start and stop services, status checks, PID handling, etc. Combining all the scripts into one not only makes maintenance easier but also brings clarity and consistency.
>
> Solution:
> 1. There can be one gobblin.sh script to handle all gobblin commands and deployment options as per the following signature. NOTE: This
> {{gobblin.sh }}
> {{gobblin.sh }}
> {{commands values: admin, cli, statestore-check, statestore-clean, historystore-manager, classpath}}
> {{service values: standalone, cluster-master, cluster-worker, aws, yarn, mr, service}}
> With the above change, the following become valid commands.
> {code:java}
> # all under GobblinCli class
> gobblin run listQuickApps -> gobblin cli run listQuickApps
> gobblin run               -> gobblin cli run
> # class: JobStateToJsonConverter
> statestore-checker.sh     -> gobblin statestore-checker
> # class: StateStoreCleaner
> statestore-clean.sh       -> gobblin statestore-clean
> # class: DatabaseJobHistoryStoreSchemaManager
> historystore-manager.sh   -> gobblin historystore-manager
> # class: Cli
> gobblin-admin.sh          -> gobblin admin
> # all gobblin deployment modes
> gobblin-cluster-master.sh -> gobblin cluster-master start|stop|status
> gobblin-cluster-worker.sh -> gobblin cluster-master start|stop|status
> gobblin-compaction.sh     -> gobblin cluster-master start|stop|status
> gobblin-env.sh            -> gobblin cluster-master start|stop|status
> gobblin-mapreduce.sh      -> gobblin cluster-master start|stop|status
> gobblin-service.sh        -> gobblin cluster-master start|stop|status
> gobblin-standalone.sh     -> gobblin cluster-master start|stop|status
> gobblin-yarn.sh           -> gobblin cluster-master start|stop|status
> {code}
>
> 2. Also, configs need to be structured and deduped accordingly to make it clear which config will be picked up for which execution mode.
> {color:#ff}
> NOTE: this refactoring adds all cli and service commands to gobblin.sh and hence changes the syntax for all commands and services.{color}

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
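As an illustration of the proposal in the issue description, the first argument to a single entry-point script could be classified as a command or a service before anything else happens. The sketch below is a hypothetical helper, not code from the actual gobblin.sh; the function name `classify_gobblin_arg` is an assumption, and only the command and service values come from the description.

```shell
# Hypothetical helper (not part of the real gobblin.sh): decide whether the
# first argument names a one-shot command or a long-running service.
# The two value lists are copied from the issue description.
classify_gobblin_arg() {
  case "$1" in
    admin|cli|statestore-check|statestore-clean|historystore-manager|classpath)
      echo "command"
      ;;
    standalone|cluster-master|cluster-worker|aws|yarn|mr|service)
      echo "service"
      ;;
    *)
      # Anything else is rejected so the caller can print usage and exit.
      echo "unknown"
      return 1
      ;;
  esac
}
```

A caller could then branch once on the result, for example running a command directly or expecting a further `start|stop|status` argument for a service.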
[jira] [Commented] (GOBBLIN-707) combine & standardize all gobblin scripts into one master script & restructure configs accordingly
[ https://issues.apache.org/jira/browse/GOBBLIN-707?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16843281#comment-16843281 ]

Jay Sen commented on GOBBLIN-707:
---------------------------------

updated to the following:
{code}
➜ gobblin-dist ./bin/gobblin
Usage:
gobblin.sh cli
gobblin.sh service

Use "gobblin --help" for more information. (Gobblin Version: 0.15.0)

➜ gobblin-dist ./bin/gobblin cli
Usage:
gobblin.sh cli

options:
    cli-commands:
        passwordManager             Encrypt or decrypt strings for the password manager.
        decrypt                     Decryption utilities
        run                         Run a Gobblin application.
        config                      Query the config library
        jobs                        Command line job info and operations
        stateMigration              Command line tools for migrating state store
        job-state-to-json           To convert Job state to JSON
        cleaner                     Data retention utility
        keystore                    Examine JCE Keystore files
        watermarks                  Inspect streaming watermarks
        job-store-schema-manager    Database job history store schema manager

    --conf-dir          Gobblon config path. default is '$GOBBLIN_HOME/conf/'.
    --log4j-conf        default is '$GOBBLIN_HOME/conf//log4j.properties'.
    --jvmopts           String containing JVM flags to include, in addition to "-Xmx1g -Xms512m".
    --jars              Column-separated list of extra jars to put on the CLASSPATH.
    --enable-gc-logs    enables gc logs & dumps.
    --show-classpath    prints gobblin runtime classpath.
    --help              Display this help.
    --verbose           Display full command used to start the process.
    Gobblin Version: 0.15.0

➜ gobblin-dist ./bin/gobblin service
Usage:
gobblin.sh service

Argument Options:
    standalone, cluster-master, cluster-worker, aws, yarn, mapreduce, service-manager.

    --conf-dir          Gobblon config path. default is '$GOBBLIN_HOME/conf/'.
    --log4j-conf        default is '$GOBBLIN_HOME/conf//log4j.properties'.
    --jvmopts           String containing JVM flags to include, in addition to "-Xmx1g -Xms512m".
    --jars              Column-separated list of extra jars to put on the CLASSPATH.
    --enable-gc-logs    enables gc logs & dumps.
    --show-classpath    prints gobblin runtime classpath.
    --cluster-name      Name of the cluster to be used by helix & other services. ( default: gobblin_cluster).
    --jt                Only for mapreduce mode: Job submission URL, if not set, taken from ${HADOOP_HOME}/conf.
    --fs                Only for mapreduce mode: Target file system, if not set, taken from ${HADOOP_HOME}/conf.
    --help              Display this help.
    --verbose           Display full command used to start the process.
    Gobblin Version: 0.15.0
{code}
[jira] [Commented] (GOBBLIN-707) combine & standardize all gobblin scripts into one master script & restructure configs accordingly
[ https://issues.apache.org/jira/browse/GOBBLIN-707?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16840878#comment-16840878 ]

Jay Sen commented on GOBBLIN-707:
---------------------------------

Hi [~ibuenros], can you pls take a look, and I will push the commit. Thanks.

If it helps, can we meet online and figure this out to expedite?
[jira] [Commented] (GOBBLIN-707) combine & standardize all gobblin scripts into one master script & restructure configs accordingly
[ https://issues.apache.org/jira/browse/GOBBLIN-707?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16836792#comment-16836792 ]

Jay Sen commented on GOBBLIN-707:
---------------------------------

Hi [~ibuenros], any comment here?
[jira] [Commented] (GOBBLIN-707) combine & standardize all gobblin scripts into one master script & restructure configs accordingly
[ https://issues.apache.org/jira/browse/GOBBLIN-707?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16832968#comment-16832968 ]

Jay Sen commented on GOBBLIN-707:
---------------------------------

With everything in gobblin.sh, I have added the following usage; please let me know your comments.
{code}
➜ gobblin-dist ./bin/gobblin
gobblin.sh cli
gobblin.sh service

Use "gobblin --help" for more information

➜ gobblin-dist ./bin/gobblin cli
gobblin.sh cli

options:
    cli-commands:       admin, jobs, statestore-check, statestore-clean, historystore-manager

    --conf-dir          Gobblon config path. default is '$GOBBLIN_HOME/conf/'.
    --log4j-conf        default is '$GOBBLIN_HOME/conf//log4j.properties'.
    --jvmopts           String containing JVM flags to include, in addition to "-Xmx1g -Xms512m".
    --jars              Column-separated list of extra jars to put on the CLASSPATH.
    --enable-gc-logs    enables gc logs & dumps.
    --show-classpath    prints gobblin runtime classpath.
    --help              Display this help.
    --verbose           Display full command used to start the process.
    Gobblin Version: 0.15.0

➜ gobblin-dist ./bin/gobblin service
gobblin.sh service

Argument Options:
    standalone, cluster-master, cluster-worker, aws, yarn, mapreduce, service-manager.

    --cluster-name      Name of the cluster to be used by helix & other services. ( default: gobblin_cluster).
    --conf-dir          Gobblon config path. default is '$GOBBLIN_HOME/conf/'.
    --log4j-conf        default is '$GOBBLIN_HOME/conf//log4j.properties'.
    --jvmopts           String containing JVM flags to include, in addition to "-Xmx1g -Xms512m".
    --jars              Column-separated list of extra jars to put on the CLASSPATH.
    --enable-gc-logs    enables gc logs & dumps.
    --show-classpath    prints gobblin runtime classpath.
    --jt                Only for mapreduce mode: Job submission URL, if not set, taken from ${HADOOP_HOME}/conf.
    --fs                Only for mapreduce mode: Target file system, if not set, taken from ${HADOOP_HOME}/conf.
    --help              Display this help.
    --verbose           Display full command used to start the process.
    Gobblin Version: 0.15.0
{code}
[jira] [Commented] (GOBBLIN-707) combine & standardize all gobblin scripts into one master script & restructure configs accordingly
[ https://issues.apache.org/jira/browse/GOBBLIN-707?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16832957#comment-16832957 ]

Jay Sen commented on GOBBLIN-707:
---------------------------------

+ comment from the git, for more clarity on what you are suggesting.
{quote}Can we leave {{gobblin.sh}} relatively simple and instead have {{gobblin-cli.sh}} and {{gobblin-service.sh}}? {{gobblin.sh}} would just redirect to the correct place depending on the first argument
{quote}
This could also be done, but it would duplicate the code for handling options (conf, jvmopts, etc.) and classpath building. Basically, pretty much all options of the gobblin-cli script are duplicated from gobblin-services (which needs all options), as shown below:

gobblin cli --help
gobblin cli
    cli-commands : admin, jobs, statestore-check, statestore-clean, historystore-manager
    params : respective parameters for the commands
    other_options:
    --conf-dir          Gobblon config path. default is '$GOBBLIN_HOME/conf/'.
    --jvmopts           String containing JVM flags to include, in addition to "-Xmx1g -Xms512m".
    --jars              Column-separated list of extra jars to put on the CLASSPATH.
    --enable-gc-logs    enables gc logs & dumps.
    --show-classpath    prints gobblin runtime classpath.
    --help              Display this help.
    --verbose           Display full command used to start the process.

gobblin services --help
gobblin service
    execution-modes : standalone, cluster-master, cluster-worker, aws, yarn, mapreduce, service-manager.
    other_options:
    --cluster-name      Name of the cluster to be used by helix & other services. ( default: gobblin_cluster).
    --conf-dir          Gobblon config path. default is '$GOBBLIN_HOME/conf/'.
    --log4j-conf        default is '$GOBBLIN_HOME/conf//log4j.properties'.
    --jvmopts           String containing JVM flags to include, in addition to "-Xmx1g -Xms512m".
    --jars              Column-separated list of extra jars to put on the CLASSPATH.
    --enable-gc-logs    enables gc logs & dumps.
    --show-classpath    prints gobblin runtime classpath.
    --jt                Only for mapreduce mode: Job submission URL, if not set, taken from ${HADOOP_HOME}/conf.
    --fs                Only for mapreduce mode: Target file system, if not set, taken from ${HADOOP_HOME}/conf.
    --help              Display this help.
    --verbose           Display full command used to start the process.

If we keep all the code for handling options and other things common, then that is pretty much what I have done in gobblin.sh. Maybe I can just separate out the help messages for cli and services so that the options for each are clearer, which aligns with what you are suggesting. Later on I can also try to bring the Java classes under GobblinCli as a separate PR, otherwise this PR will keep growing... :)

Let me know if you think otherwise, and I will think about how to make that change.

Thanks
Jay
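The duplication concern in the comment above can be made concrete with a minimal sketch of one option parser shared by both the cli and service paths. Everything here is an assumption for illustration: the function name `parse_common_opts`, the variable names, and the idea that `--conf-dir`, `--jvmopts` and `--jars` each take one value (the help text in the thread has lost its argument placeholders). Only the option names and the default JVM flags come from the thread.

```shell
# Hypothetical shared parser (not the real gobblin.sh). It fills a few global
# variables and collects everything it does not recognise into REST.
# For brevity REST is a space-joined string; a real script would need to
# preserve quoting of the remaining arguments.
parse_common_opts() {
  CONF_DIR=""
  JVM_OPTS="-Xmx1g -Xms512m"   # defaults quoted in the help output
  EXTRA_JARS=""
  VERBOSE=0
  REST=""
  while [ $# -gt 0 ]; do
    case "$1" in
      --conf-dir|--jvmopts|--jars)
        # Assumption: each of these options takes exactly one value.
        if [ $# -lt 2 ]; then
          echo "missing value for $1" >&2
          return 1
        fi
        case "$1" in
          --conf-dir) CONF_DIR="$2" ;;
          --jvmopts)  JVM_OPTS="$JVM_OPTS $2" ;;
          --jars)     EXTRA_JARS="$2" ;;
        esac
        shift 2
        ;;
      --verbose)
        VERBOSE=1
        shift
        ;;
      *)
        REST="${REST:+$REST }$1"
        shift
        ;;
    esac
  done
}
```

With a parser like this, separate help messages for cli and service could still be printed from one script, which is roughly the compromise proposed in the comment.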
[jira] [Commented] (GOBBLIN-707) combine & standardize all gobblin scripts into one master script & restructure configs accordingly
[ https://issues.apache.org/jira/browse/GOBBLIN-707?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16832912#comment-16832912 ]

Issac Buenrostro commented on GOBBLIN-707:
------------------------------------------

I see, didn't realize there was so much added to `gobblin.cli`. Can we do this to avoid confusing what options apply to each mode?
{code:java}
gobblin --help
gobblin cli
gobblin service

Use "gobblin --help" for more information
{code}
[jira] [Commented] (GOBBLIN-707) combine & standardize all gobblin scripts into one master script & restructure configs accordingly
[ https://issues.apache.org/jira/browse/GOBBLIN-707?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16832130#comment-16832130 ]

Jay Sen commented on GOBBLIN-707:
---------------------------------

Sure, that would be even better, but it will require even further refactoring of the Java classes for statestore-checker and others to bring them under the {{Alias}} annotation and make them go through {{GobblinCli}}. I will do that once you confirm the following syntax:
{code}
gobblin --help
gobblin.sh cli
gobblin.sh service

Argument Options:
    admin, jobs, statestore-check, statestore-clean, historystore-manager
    standalone, cluster-master, cluster-worker, aws, yarn, mapreduce, service-manager.

    --cluster-name      Name of the cluster to be used by helix & other services. ( default: gobblin_cluster).
    --conf-dir          Gobblon config path. default is '$GOBBLIN_HOME/conf/'.
    --log4j-conf        default is '$GOBBLIN_HOME/conf//log4j.properties'.
    --jvmopts           String containing JVM flags to include, in addition to "-Xmx1g -Xms512m".
    --jars              Column-separated list of extra jars to put on the CLASSPATH.
    --enable-gc-logs    enables gc logs & dumps.
    --show-classpath    prints gobblin runtime classpath.
    --jt                Only for mapreduce mode: Job submission URL, if not set, taken from ${HADOOP_HOME}/conf.
    --fs                Only for mapreduce mode: Target file system, if not set, taken from ${HADOOP_HOME}/conf.
    --help              Display this help.
    --verbose           Display full command used to start the process.
    Gobblin Version: 0.15.0
{code}
btw, all the removed scripts have been incorporated into the above gobblin.sh changes in one way or another; I will double check on that anyway.
[jira] [Commented] (GOBBLIN-707) combine & standardize all gobblin scripts into one master script & restructure configs accordingly
[ https://issues.apache.org/jira/browse/GOBBLIN-707?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16832080#comment-16832080 ]

Issac Buenrostro commented on GOBBLIN-707:
--

Thanks for taking this up, [~jaysen]. I do see the point of cleaning up the multiple scripts that Gobblin has; however, I would argue that the cleanup should be done a bit differently. As you pointed out, there are two types of scripts: commands and services.

* For commands, the scripts are always pretty much identical, so I believe the access should always be through `GobblinCli` (i.e. implemented as `CliApplication`s). This means that instead of `gobblin statestore-checker` it should be `gobblin cli statestore-checker`, and the bash portion of the script would be unique. This has the advantage that `gobblin cli --help` will list all commands, and commands are self-documenting through the `@Alias` annotation. Even better, if we use `ConstructorAndPublicMethodsCliObjectFactory`, it will automatically create a help string for each one and allow programmatic and CLI access with the same input.
* For services, I'm not sure how you're approaching things, but it would also be nice to have a single bash script that can handle all of them (given that, as you pointed out, they are all of the form `start|stop|status`).

Re: the PR, I'm a bit confused because a lot of scripts were removed but I don't understand where the replacements are.
I may be missing something obvious, and I apologize if that is the case :)
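For illustration, a single bash script covering every service of the form `start|stop|status` could be sketched roughly as follows. This is only a minimal sketch under assumptions: the PID directory, the function names, and the `sleep` placeholder standing in for the real JVM launch are hypothetical and are not taken from the PR; only the service names and the three actions come from the issue description.

```shell
#!/usr/bin/env bash
# Hypothetical sketch: one dispatcher for all Gobblin service modes.
# PID_DIR, the function names and the 'sleep' placeholder are assumptions.

PID_DIR="${GOBBLIN_PID_DIR:-/tmp/gobblin-pids}"
SERVICES="standalone cluster-master cluster-worker aws yarn mr service"

# Path of the PID file for a given service.
pid_file() { echo "$PID_DIR/gobblin-$1.pid"; }

# Succeeds if a PID file exists and its process is alive.
is_running() {
  local file
  file="$(pid_file "$1")"
  [ -f "$file" ] && kill -0 "$(cat "$file")" 2>/dev/null
}

# Placeholder launcher: a real script would compute the classpath and JVM
# options here and exec the Java main class for the requested mode.
launch_service() { exec sleep 300; }

start_service() {
  local svc="$1"
  if is_running "$svc"; then
    echo "$svc is already running"
    return 0
  fi
  mkdir -p "$PID_DIR"
  launch_service "$svc" >/dev/null 2>&1 &
  echo $! > "$(pid_file "$svc")"   # record the background PID
  echo "$svc started"
}

stop_service() {
  local svc="$1" file
  file="$(pid_file "$svc")"
  if is_running "$svc"; then
    kill "$(cat "$file")"
    rm -f "$file"
    echo "$svc stopped"
  else
    rm -f "$file"                  # clean up a stale PID file, if any
    echo "$svc is not running"
  fi
}

status_service() {
  if is_running "$1"; then
    echo "$1 is running (pid $(cat "$(pid_file "$1")"))"
  else
    echo "$1 is not running"
    return 1
  fi
}

# Entry point: gobblin_service <service> start|stop|status
gobblin_service() {
  local svc="$1" action="$2"
  case " $SERVICES " in
    *" $svc "*) ;;
    *) echo "unknown service: $svc" >&2; return 2 ;;
  esac
  case "$action" in
    start)  start_service "$svc" ;;
    stop)   stop_service "$svc" ;;
    status) status_service "$svc" ;;
    *) echo "usage: gobblin <service> start|stop|status" >&2; return 2 ;;
  esac
}

# Dispatch only when arguments are given, so the file can also be sourced.
if [ "$#" -gt 0 ]; then
  gobblin_service "$@"
fi
```

With this shape, adding a new deployment mode would only mean extending the service list and the launcher, while the PID handling and status checks stay in one place.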