[jira] [Created] (GOBBLIN-1205) restarting gobblin on yarn fails with error
Jay Sen created GOBBLIN-1205: Summary: restarting gobblin on yarn fails with error Key: GOBBLIN-1205 URL: https://issues.apache.org/jira/browse/GOBBLIN-1205 Project: Apache Gobblin Issue Type: Bug Affects Versions: 0.15.0 Reporter: Jay Sen Fix For: 0.15.0 Restarting Gobblin deployed in YARN mode occasionally fails at startup with the following error. The ZooKeeper path may still be held by the previous process, so it may just need a bit of time between stop and start.
{code:java}
WARN [ZKHelixAdmin] Root directory exists.Cleaning the root directory:/GobblinYarnHelixApp
WARN [ZKHelixAdmin] Root directory exists.Cleaning the root directory:/GobblinYarnHelixApp
WARN [ZkClient] Failed to delete path /GobblinYarnHelixApp/CONTROLLER! org.I0Itec.zkclient.exception.ZkException: org.apache.zookeeper.KeeperException$NotEmptyException: KeeperErrorCode = Directory not empty for /GobblinYarnHelixApp/CONTROLLER
ERROR [ZkClient] Failed to delete /GobblinYarnHelixApp/CONTROLLER
org.I0Itec.zkclient.exception.ZkException: org.apache.zookeeper.KeeperException$NotEmptyException: KeeperErrorCode = Directory not empty for /GobblinYarnHelixApp/CONTROLLER
    at org.I0Itec.zkclient.exception.ZkException.create(ZkException.java:68)
    at org.apache.helix.manager.zk.zookeeper.ZkClient.retryUntilConnected(ZkClient.java:1160)
    at org.apache.helix.manager.zk.zookeeper.ZkClient.delete(ZkClient.java:1215)
    at org.apache.helix.manager.zk.zookeeper.ZkClient.deleteRecursively(ZkClient.java:949)
    at org.apache.helix.manager.zk.zookeeper.ZkClient.deleteRecursively(ZkClient.java:942)
    at org.apache.helix.manager.zk.ZKHelixAdmin.addCluster(ZKHelixAdmin.java:698)
    at org.apache.helix.tools.ClusterSetup.addCluster(ClusterSetup.java:162)
    at org.apache.gobblin.cluster.HelixUtils.createGobblinHelixCluster(HelixUtils.java:96)
    at org.apache.gobblin.yarn.GobblinYarnAppLauncher.launch(GobblinYarnAppLauncher.java:337)
    at org.apache.gobblin.yarn.GobblinYarnAppLauncher.main(GobblinYarnAppLauncher.java:1067)
Caused by: org.apache.zookeeper.KeeperException$NotEmptyException: KeeperErrorCode = Directory not empty for /GobblinYarnHelixApp/CONTROLLER
    at org.apache.zookeeper.KeeperException.create(KeeperException.java:128)
    at org.apache.zookeeper.KeeperException.create(KeeperException.java:54)
    at org.apache.zookeeper.ZooKeeper.delete(ZooKeeper.java:882)
    at org.apache.helix.manager.zk.zookeeper.ZkConnection.delete(ZkConnection.java:119)
    at org.apache.helix.manager.zk.zookeeper.ZkClient$9.call(ZkClient.java:1219)
    at org.apache.helix.manager.zk.zookeeper.ZkClient.retryUntilConnected(ZkClient.java:1150)
    ... 8 more
==> logs/yarn.err <==
Exception in thread "main" org.apache.helix.HelixException: Failed to delete /GobblinYarnHelixApp/CONTROLLER
    at org.apache.helix.manager.zk.zookeeper.ZkClient.deleteRecursively(ZkClient.java:952)
    at org.apache.helix.manager.zk.zookeeper.ZkClient.deleteRecursively(ZkClient.java:942)
    at org.apache.helix.manager.zk.ZKHelixAdmin.addCluster(ZKHelixAdmin.java:698)
    at org.apache.helix.tools.ClusterSetup.addCluster(ClusterSetup.java:162)
    at org.apache.gobblin.cluster.HelixUtils.createGobblinHelixCluster(HelixUtils.java:96)
    at org.apache.gobblin.yarn.GobblinYarnAppLauncher.launch(GobblinYarnAppLauncher.java:337)
    at org.apache.gobblin.yarn.GobblinYarnAppLauncher.main(GobblinYarnAppLauncher.java:1067)
Caused by: org.I0Itec.zkclient.exception.ZkException: org.apache.zookeeper.KeeperException$NotEmptyException: KeeperErrorCode = Directory not empty for /GobblinYarnHelixApp/CONTROLLER
    at org.I0Itec.zkclient.exception.ZkException.create(ZkException.java:68)
    at org.apache.helix.manager.zk.zookeeper.ZkClient.retryUntilConnected(ZkClient.java:1160)
    at org.apache.helix.manager.zk.zookeeper.ZkClient.delete(ZkClient.java:1215)
    at org.apache.helix.manager.zk.zookeeper.ZkClient.deleteRecursively(ZkClient.java:949)
    ... 6 more
Caused by: org.apache.zookeeper.KeeperException$NotEmptyException: KeeperErrorCode = Directory not empty for /GobblinYarnHelixApp/CONTROLLER
    at org.apache.zookeeper.KeeperException.create(KeeperException.java:128)
    at org.apache.zookeeper.KeeperException.create(KeeperException.java:54)
    at org.apache.zookeeper.ZooKeeper.delete(ZooKeeper.java:882)
    at org.apache.helix.manager.zk.zookeeper.ZkConnection.delete(ZkConnection.java:119)
    at org.apache.helix.manager.zk.zookeeper.ZkClient$9.call(ZkClient.java:1219)
    at org.apache.helix.manager.zk.zookeeper.ZkClient.retryUntilConnected(ZkClient.java:1150)
    ... 8 more
Exception in thread "Thread-6" org.apache.helix.HelixException: HelixManager (ZkClient) is not connected. Call HelixManager#connect()
    at org.apache.helix.manager.zk.ZKHelixManager.checkConnected(ZKHelixManager.java:363)
    at org.apache.helix.manager.zk.ZKHelixManager.getClusterManagmentTool(ZKHelixManager.java:908)
    at
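If the failure in GOBBLIN-1205 really is a timing issue between stop and start, one possible mitigation is to retry the failing step a few times with a short pause in between. The following is only a minimal sketch of that idea and is not Gobblin or Helix code: the class `RetryWithDelay`, its parameters, and the simulated failure are all hypothetical, and it assumes the failing call surfaces as a runtime exception.

```java
import java.util.function.Supplier;

/** Hypothetical helper (not part of Gobblin or Helix): retries an action with a fixed pause. */
final class RetryWithDelay {
  private RetryWithDelay() {}

  /** Runs the action up to maxAttempts times, sleeping delayMillis between failed attempts. */
  static <T> T call(Supplier<T> action, int maxAttempts, long delayMillis) {
    if (maxAttempts < 1) {
      throw new IllegalArgumentException("maxAttempts must be at least 1");
    }
    RuntimeException last = null;
    for (int attempt = 1; attempt <= maxAttempts; attempt++) {
      try {
        return action.get();
      } catch (RuntimeException e) {
        last = e; // remember the failure and try again if attempts remain
        if (attempt < maxAttempts) {
          pause(delayMillis);
        }
      }
    }
    throw last; // every attempt failed: surface the last error
  }

  private static void pause(long millis) {
    try {
      Thread.sleep(millis);
    } catch (InterruptedException ie) {
      Thread.currentThread().interrupt(); // preserve the interrupt flag
      throw new IllegalStateException("Interrupted while waiting to retry", ie);
    }
  }

  public static void main(String[] args) {
    int[] calls = {0};
    // Simulated stand-in for cluster creation: fails twice, then succeeds.
    String result = call(() -> {
      calls[0]++;
      if (calls[0] < 3) {
        throw new IllegalStateException("Directory not empty (simulated)");
      }
      return "cluster created";
    }, 5, 10L);
    System.out.println(result + " after " + calls[0] + " attempts");
  }
}
```

A fixed pause keeps the sketch short; whether a retry is appropriate at all depends on why the `CONTROLLER` path is still populated, which the report leaves open.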
[jira] [Created] (GOBBLIN-1204) stopping yarn does not stop application master or containers
Jay Sen created GOBBLIN-1204: Summary: stopping yarn does not stop application master or containers Key: GOBBLIN-1204 URL: https://issues.apache.org/jira/browse/GOBBLIN-1204 Project: Apache Gobblin Issue Type: Bug Affects Versions: 0.15.0 Reporter: Jay Sen Fix For: 0.15.0 The Gobblin command that stops YARN mode should also stop the YARN application on Hadoop, including all of its containers. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (GOBBLIN-1194) add Google cloud connector
Jay Sen created GOBBLIN-1194: Summary: add Google cloud connector Key: GOBBLIN-1194 URL: https://issues.apache.org/jira/browse/GOBBLIN-1194 Project: Apache Gobblin Issue Type: New Feature Affects Versions: 0.15.0 Reporter: Jay Sen Fix For: 0.15.0 This task is for adding the Google Cloud connector to Gobblin to support writing to GCS. ref: [https://cloud.google.com/dataproc/docs/concepts/connectors/cloud-storage]
[jira] [Created] (GOBBLIN-1182) Incorrect datetime value error on mysql 5.7
Jay Sen created GOBBLIN-1182: Summary: Incorrect datetime value error on mysql 5.7 Key: GOBBLIN-1182 URL: https://issues.apache.org/jira/browse/GOBBLIN-1182 Project: Apache Gobblin Issue Type: Improvement Components: gobblin-core Affects Versions: 0.15.0 Reporter: Jay Sen Assignee: Abhishek Tiwari Fix For: 0.15.0 Inserting a jobExecution record with the default epoch timestamp value throws an error on MySQL 5.7+. The reason is that MySQL converts TIMESTAMP values from the session time zone to UTC for storage, and by default 5.7+ rejects any value that ends up at or before the epoch unless the server is explicitly configured to allow zero or out-of-range times. As a quick fix, Gobblin can use 1970-01-02 00:00:00, which stays after the epoch in every time zone. An alternative solution is not to set end_Date at all and to allow NULL to indicate "not set".
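The reasoning behind the 1970-01-02 00:00:00 suggestion in GOBBLIN-1182 can be checked with plain `java.time` arithmetic. The sketch below is illustrative only and does not touch MySQL or Gobblin: the class and method names are hypothetical, and it assumes the only requirement is that the wall-clock value maps to an instant strictly after the Unix epoch at every UTC offset that `ZoneOffset` supports.

```java
import java.time.LocalDateTime;
import java.time.ZoneOffset;

/** Hypothetical check (not Gobblin code): is a wall-clock value after the epoch at every offset? */
final class SentinelTimestampCheck {
  private SentinelTimestampCheck() {}

  /** Tries every offset from UTC-18:00 to UTC+18:00 in 15-minute steps. */
  static boolean isAfterEpochEverywhere(LocalDateTime wallClock) {
    for (int minutes = -18 * 60; minutes <= 18 * 60; minutes += 15) {
      ZoneOffset offset = ZoneOffset.ofTotalSeconds(minutes * 60);
      // Epoch second 0 or below means the value is at or before 1970-01-01 00:00:00 UTC.
      if (wallClock.toEpochSecond(offset) <= 0) {
        return false;
      }
    }
    return true;
  }

  public static void main(String[] args) {
    LocalDateTime epochDefault = LocalDateTime.of(1970, 1, 1, 0, 0, 0);
    LocalDateTime proposedDefault = LocalDateTime.of(1970, 1, 2, 0, 0, 0);
    System.out.println("1970-01-01 00:00:00 safe everywhere: " + isAfterEpochEverywhere(epochDefault));
    System.out.println("1970-01-02 00:00:00 safe everywhere: " + isAfterEpochEverywhere(proposedDefault));
  }
}
```

The first value fails because any offset east of UTC pushes it before the epoch, while the second still lands on 1970-01-01 in UTC even at the largest positive offset.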
[jira] [Created] (GOBBLIN-1161) Hadoop 3.x support
Jay Sen created GOBBLIN-1161: Summary: Hadoop 3.x support Key: GOBBLIN-1161 URL: https://issues.apache.org/jira/browse/GOBBLIN-1161 Project: Apache Gobblin Issue Type: New Feature Reporter: Jay Sen Add support for Hadoop 3.x. There are already some breaking changes, so we may have to brainstorm how to add this support.
[jira] [Updated] (GOBBLIN-1139) Basic Aerospike key value writer with Avro schema
[ https://issues.apache.org/jira/browse/GOBBLIN-1139?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jay Sen updated GOBBLIN-1139: - Summary: Basic Aerospike key value writer with Avro schema (was: Basic Aerospike writer with Avro schema) > Basic Aerospike key value writer with Avro schema > - > > Key: GOBBLIN-1139 > URL: https://issues.apache.org/jira/browse/GOBBLIN-1139 > Project: Apache Gobblin > Issue Type: Sub-task >Reporter: Jay Sen >Priority: Major > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (GOBBLIN-1138) AeroSpike Writer
Jay Sen created GOBBLIN-1138: Summary: AeroSpike Writer Key: GOBBLIN-1138 URL: https://issues.apache.org/jira/browse/GOBBLIN-1138 Project: Apache Gobblin Issue Type: New Feature Components: gobblin-connectors Affects Versions: 0.15.0 Reporter: Jay Sen Assignee: Shirshanka Das Fix For: 0.15.0 Aerospike is a very efficient key-value store with high throughput. This story is to add an Aerospike writer to Gobblin. It should have features like the following:
# Basic key-value writer
# Abstract writer, with an implementation that supports Avro schema
# Sync/async write modes
# Throttling integration
# Unsecured and secure connection handling
# Error and retry handling (retry queue for data delivery SLA)
# Metrics integration specific to the writer
This will be a phased development with subtasks. Please feel free to coordinate for contribution.
[jira] [Created] (GOBBLIN-1139) Basic Aerospike writer with Avro schema
Jay Sen created GOBBLIN-1139: Summary: Basic Aerospike writer with Avro schema Key: GOBBLIN-1139 URL: https://issues.apache.org/jira/browse/GOBBLIN-1139 Project: Apache Gobblin Issue Type: Sub-task Reporter: Jay Sen -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (GOBBLIN-1128) Gobblin JobStore on MySQL with CRUD API
Jay Sen created GOBBLIN-1128: Summary: Gobblin JobStore on MySQL with CRUD API Key: GOBBLIN-1128 URL: https://issues.apache.org/jira/browse/GOBBLIN-1128 Project: Apache Gobblin Issue Type: Improvement Components: gobblin-api, gobblin-config Affects Versions: 0.15.0 Reporter: Jay Sen Assignee: Hung Tran Fix For: 0.15.0 [https://cwiki.apache.org/confluence/display/GOBBLIN/GIP+4%3A+MySQL+backed+job+config+store] -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (GOBBLIN-1128) Gobblin JobStore on MySQL with CRUD API
[ https://issues.apache.org/jira/browse/GOBBLIN-1128?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jay Sen updated GOBBLIN-1128: - Description: [details: https://cwiki.apache.org/confluence/display/GOBBLIN/GIP+4%3A+MySQL+backed+job+config+store|https://cwiki.apache.org/confluence/display/GOBBLIN/GIP+4%3A+MySQL+backed+job+config+store] was:[https://cwiki.apache.org/confluence/display/GOBBLIN/GIP+4%3A+MySQL+backed+job+config+store] Issue Type: New Feature (was: Improvement) > Gobblin JobStore on MySQL with CRUD API > --- > > Key: GOBBLIN-1128 > URL: https://issues.apache.org/jira/browse/GOBBLIN-1128 > Project: Apache Gobblin > Issue Type: New Feature > Components: gobblin-api, gobblin-config >Affects Versions: 0.15.0 >Reporter: Jay Sen >Assignee: Hung Tran >Priority: Major > Fix For: 0.15.0 > > > [details: > https://cwiki.apache.org/confluence/display/GOBBLIN/GIP+4%3A+MySQL+backed+job+config+store|https://cwiki.apache.org/confluence/display/GOBBLIN/GIP+4%3A+MySQL+backed+job+config+store] > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Closed] (GOBBLIN-1095) jcenter only supports https now
[ https://issues.apache.org/jira/browse/GOBBLIN-1095?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jay Sen closed GOBBLIN-1095. Resolution: Duplicate > jcenter only supports https now > --- > > Key: GOBBLIN-1095 > URL: https://issues.apache.org/jira/browse/GOBBLIN-1095 > Project: Apache Gobblin > Issue Type: Improvement >Affects Versions: 0.15.0 >Reporter: Jay Sen >Priority: Critical > Fix For: 0.15.0 > > > [http://jcenter.bintray.com/] repo no longer works to download libraries. > need to update it to https now. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (GOBBLIN-1095) jcenter only supports https now
Jay Sen created GOBBLIN-1095: Summary: jcenter only supports https now Key: GOBBLIN-1095 URL: https://issues.apache.org/jira/browse/GOBBLIN-1095 Project: Apache Gobblin Issue Type: Improvement Affects Versions: 0.15.0 Reporter: Jay Sen Fix For: 0.15.0 The [http://jcenter.bintray.com/] repo no longer works for downloading libraries; the repository URL needs to be updated to https.
[jira] [Commented] (GOBBLIN-984) Get Gobblin log outputting to work
[ https://issues.apache.org/jira/browse/GOBBLIN-984?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16983968#comment-16983968 ] Jay Sen commented on GOBBLIN-984: - Pls feel free to revert. > Get Gobblin log outputting to work > -- > > Key: GOBBLIN-984 > URL: https://issues.apache.org/jira/browse/GOBBLIN-984 > Project: Apache Gobblin > Issue Type: Bug >Reporter: William Lo >Priority: Minor > Time Spent: 40m > Remaining Estimate: 0h > > [GOBBLIN-822]|[https://github.com/apache/incubator-gobblin/pull/2685] > Introduced issues with the scripts not outputting logs. Need to include > log4j2.xml in each service's `conf` folder -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (GOBBLIN-984) Get Gobblin log outputting to work
[ https://issues.apache.org/jira/browse/GOBBLIN-984?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16983967#comment-16983967 ] Jay Sen commented on GOBBLIN-984: - Sure. Didnt know you guys r using master branch. I will setup sometime to talk about this. Thanks > Get Gobblin log outputting to work > -- > > Key: GOBBLIN-984 > URL: https://issues.apache.org/jira/browse/GOBBLIN-984 > Project: Apache Gobblin > Issue Type: Bug >Reporter: William Lo >Priority: Minor > Time Spent: 40m > Remaining Estimate: 0h > > [GOBBLIN-822]|[https://github.com/apache/incubator-gobblin/pull/2685] > Introduced issues with the scripts not outputting logs. Need to include > log4j2.xml in each service's `conf` folder -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (GOBBLIN-984) Get Gobblin log outputting to work
[ https://issues.apache.org/jira/browse/GOBBLIN-984?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16983963#comment-16983963 ] Jay Sen commented on GOBBLIN-984: - Can we meet later today anytime after 4pm? I will send you the zoom link to join. Thanks > Get Gobblin log outputting to work > -- > > Key: GOBBLIN-984 > URL: https://issues.apache.org/jira/browse/GOBBLIN-984 > Project: Apache Gobblin > Issue Type: Bug >Reporter: William Lo >Priority: Minor > Time Spent: 40m > Remaining Estimate: 0h > > [GOBBLIN-822]|[https://github.com/apache/incubator-gobblin/pull/2685] > Introduced issues with the scripts not outputting logs. Need to include > log4j2.xml in each service's `conf` folder -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (GOBBLIN-984) Get Gobblin log outputting to work
[ https://issues.apache.org/jira/browse/GOBBLIN-984?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16983958#comment-16983958 ] Jay Sen commented on GOBBLIN-984: - We can work that out with slf4j integration Lets try to figure it out. Do u want to meet up online? > Get Gobblin log outputting to work > -- > > Key: GOBBLIN-984 > URL: https://issues.apache.org/jira/browse/GOBBLIN-984 > Project: Apache Gobblin > Issue Type: Bug >Reporter: William Lo >Priority: Minor > Time Spent: 40m > Remaining Estimate: 0h > > [GOBBLIN-822]|[https://github.com/apache/incubator-gobblin/pull/2685] > Introduced issues with the scripts not outputting logs. Need to include > log4j2.xml in each service's `conf` folder -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (GOBBLIN-985) upgrade Guava libs
Jay Sen created GOBBLIN-985: --- Summary: upgrade Guava libs Key: GOBBLIN-985 URL: https://issues.apache.org/jira/browse/GOBBLIN-985 Project: Apache Gobblin Issue Type: Sub-task Affects Versions: 0.14.0 Reporter: Jay Sen Fix For: 0.15.0 Gobblin is using a very old Guava library (15.0, from 2013); much newer releases, such as 26.0, are available.
[jira] [Commented] (GOBBLIN-984) Get Gobblin log outputting to work
[ https://issues.apache.org/jira/browse/GOBBLIN-984?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16982927#comment-16982927 ] Jay Sen commented on GOBBLIN-984: - created this PR: [https://github.com/apache/incubator-gobblin/pull/2830] we also need this PR to work together to standardize all other config and env issues : [https://github.com/apache/incubator-gobblin/pull/2788] > Get Gobblin log outputting to work > -- > > Key: GOBBLIN-984 > URL: https://issues.apache.org/jira/browse/GOBBLIN-984 > Project: Apache Gobblin > Issue Type: Bug >Reporter: William Lo >Priority: Minor > Time Spent: 10m > Remaining Estimate: 0h > > [GOBBLIN-822]|[https://github.com/apache/incubator-gobblin/pull/2685] > Introduced issues with the scripts not outputting logs. Need to include > log4j2.xml in each service's `conf` folder -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (GOBBLIN-984) Get Gobblin log outputting to work
[ https://issues.apache.org/jira/browse/GOBBLIN-984?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16982766#comment-16982766 ] Jay Sen commented on GOBBLIN-984: - Sure, I will create PR in sometime, Thanks > Get Gobblin log outputting to work > -- > > Key: GOBBLIN-984 > URL: https://issues.apache.org/jira/browse/GOBBLIN-984 > Project: Apache Gobblin > Issue Type: Bug >Reporter: William Lo >Priority: Minor > > [GOBBLIN-822]|[https://github.com/apache/incubator-gobblin/pull/2685] > Introduced issues with the scripts not outputting logs. Need to include > log4j2.xml in each service's `conf` folder -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Comment Edited] (GOBBLIN-984) Get Gobblin log outputting to work
[ https://issues.apache.org/jira/browse/GOBBLIN-984?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16982753#comment-16982753 ] Jay Sen edited comment on GOBBLIN-984 at 11/26/19 6:31 PM: --- Hi [~wlo], I have this ready which i was using it on my local, missed to include those files, i can create PR, pls let me know. THanks was (Author: jaysen): Hi [~wlo], I have this ready which i was using it on my local, i can create PR, pls let me know. THanks > Get Gobblin log outputting to work > -- > > Key: GOBBLIN-984 > URL: https://issues.apache.org/jira/browse/GOBBLIN-984 > Project: Apache Gobblin > Issue Type: Bug >Reporter: William Lo >Priority: Minor > > [GOBBLIN-822]|[https://github.com/apache/incubator-gobblin/pull/2685] > Introduced issues with the scripts not outputting logs. Need to include > log4j2.xml in each service's `conf` folder -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (GOBBLIN-984) Get Gobblin log outputting to work
[ https://issues.apache.org/jira/browse/GOBBLIN-984?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16982753#comment-16982753 ] Jay Sen commented on GOBBLIN-984: - Hi [~wlo], I have this ready, i can create PR, pls let me know. THanks > Get Gobblin log outputting to work > -- > > Key: GOBBLIN-984 > URL: https://issues.apache.org/jira/browse/GOBBLIN-984 > Project: Apache Gobblin > Issue Type: Bug >Reporter: William Lo >Priority: Minor > > [GOBBLIN-822]|[https://github.com/apache/incubator-gobblin/pull/2685] > Introduced issues with the scripts not outputting logs. Need to include > log4j2.xml in each service's `conf` folder -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Comment Edited] (GOBBLIN-984) Get Gobblin log outputting to work
[ https://issues.apache.org/jira/browse/GOBBLIN-984?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16982753#comment-16982753 ] Jay Sen edited comment on GOBBLIN-984 at 11/26/19 6:30 PM: --- Hi [~wlo], I have this ready which i was using it on my local, i can create PR, pls let me know. THanks was (Author: jaysen): Hi [~wlo], I have this ready, i can create PR, pls let me know. THanks > Get Gobblin log outputting to work > -- > > Key: GOBBLIN-984 > URL: https://issues.apache.org/jira/browse/GOBBLIN-984 > Project: Apache Gobblin > Issue Type: Bug >Reporter: William Lo >Priority: Minor > > [GOBBLIN-822]|[https://github.com/apache/incubator-gobblin/pull/2685] > Introduced issues with the scripts not outputting logs. Need to include > log4j2.xml in each service's `conf` folder -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Issue Comment Deleted] (GOBBLIN-824) upgrade component and dependency library versions
[ https://issues.apache.org/jira/browse/GOBBLIN-824?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jay Sen updated GOBBLIN-824: Comment: was deleted (was: GOBBLIN-818 does minor upgrades.) > upgrade component and dependency library versions > - > > Key: GOBBLIN-824 > URL: https://issues.apache.org/jira/browse/GOBBLIN-824 > Project: Apache Gobblin > Issue Type: Improvement >Affects Versions: 0.14.0 >Reporter: Jay Sen >Priority: Major > Fix For: 0.15.0 > > > A lot of libs are old, like hadoop, hive, etc... > It won't be easy to just compile gobblin against a new version by passing the new > version on the command line; there have been a lot of changes over the last couple of years. > Gobblin should use the latest versions: > Hadoop: 2.9.x > hive : 2.3.5 > pegasus: 24.0.2 > Avro : 1.8.2 > etc... > Please feel free to mention which libs should be updated as part of this > overall upgrade process.
[jira] [Updated] (GOBBLIN-824) upgrade component and dependency library versions
[ https://issues.apache.org/jira/browse/GOBBLIN-824?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jay Sen updated GOBBLIN-824: Summary: upgrade component and dependency library versions (was: upgrade to latest libraries in Gobblin) > upgrade component and dependency library versions > - > > Key: GOBBLIN-824 > URL: https://issues.apache.org/jira/browse/GOBBLIN-824 > Project: Apache Gobblin > Issue Type: Improvement >Affects Versions: 0.14.0 >Reporter: Jay Sen >Priority: Major > Fix For: 0.15.0 > > > A lot of libs are old, like hadoop, hive, etc... > It won't be easy to just compile gobblin against a new version by passing the new > version on the command line; there have been a lot of changes over the last couple of years. > Gobblin should use the latest versions: > Hadoop: 2.9.x > hive : 2.3.5 > pegasus: 24.0.2 > Avro : 1.8.2 > etc... > Please feel free to mention which libs should be updated as part of this > overall upgrade process.
[jira] [Updated] (GOBBLIN-741) update influxdb-java dependency to latest (> 2.15)
[ https://issues.apache.org/jira/browse/GOBBLIN-741?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jay Sen updated GOBBLIN-741: Parent: GOBBLIN-824 Issue Type: Sub-task (was: Improvement) > update influxdb-java dependency to latest (> 2.15) > -- > > Key: GOBBLIN-741 > URL: https://issues.apache.org/jira/browse/GOBBLIN-741 > Project: Apache Gobblin > Issue Type: Sub-task >Reporter: Jay Sen >Priority: Major > > The current influxdb-java dependency is 2.1, which was released in Dec 2015. > Updating it can bring in new functionality from the latest InfluxDB and new ways of testing > the InfluxDB integration within Gobblin.
[jira] [Updated] (GOBBLIN-823) upgrade or remove lombok usage
[ https://issues.apache.org/jira/browse/GOBBLIN-823?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jay Sen updated GOBBLIN-823: Parent: GOBBLIN-824 Issue Type: Sub-task (was: Improvement) > upgrade or remove lombok usage > -- > > Key: GOBBLIN-823 > URL: https://issues.apache.org/jira/browse/GOBBLIN-823 > Project: Apache Gobblin > Issue Type: Sub-task >Affects Versions: 0.14.0 >Reporter: Jay Sen >Priority: Major > Fix For: 0.15.0 > > > Lombok is useful once a developer gets used to it, but it has its own learning > curve for new users. Given that Lombok adds unnecessary complexity and most of > what it does can be handled by a smart IDE anyway, I suggest we remove the use of > Lombok. > If not, let's at least upgrade to the latest version to get all the bug fixes.
[jira] [Updated] (GOBBLIN-979) upgrade Hive version
[ https://issues.apache.org/jira/browse/GOBBLIN-979?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jay Sen updated GOBBLIN-979: Parent: GOBBLIN-824 Issue Type: Sub-task (was: Bug) > upgrade Hive version > > > Key: GOBBLIN-979 > URL: https://issues.apache.org/jira/browse/GOBBLIN-979 > Project: Apache Gobblin > Issue Type: Sub-task > Components: hive-registration >Affects Versions: 0.14.0 >Reporter: Jay Sen >Priority: Major > Fix For: 0.15.0 > > > Gobblin uses old hive 1.0, compiling against Hive 1.2 is not compatible due > to backward incompatible changes in Hive 1.2 > Also there are many changes in Hive 2.x and 3.x, that we need to think about > on how to provide the support for newer Hive versions in Gobblin -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (GOBBLIN-818) upgrade default hadoop versions to 2.7.x
[ https://issues.apache.org/jira/browse/GOBBLIN-818?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jay Sen updated GOBBLIN-818: Parent: GOBBLIN-824 Issue Type: Sub-task (was: Improvement) > upgrade default hadoop versions to 2.7.x > > > Key: GOBBLIN-818 > URL: https://issues.apache.org/jira/browse/GOBBLIN-818 > Project: Apache Gobblin > Issue Type: Sub-task >Reporter: Jay Sen >Priority: Major > > Gobblin should also upgrade Hadoop from 2.3 to 2.7.x at least, which is very > stable. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (GOBBLIN-979) upgrade Hive version
[ https://issues.apache.org/jira/browse/GOBBLIN-979?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jay Sen updated GOBBLIN-979: Parent: (was: GOBBLIN-818) Issue Type: Bug (was: Sub-task) > upgrade Hive version > > > Key: GOBBLIN-979 > URL: https://issues.apache.org/jira/browse/GOBBLIN-979 > Project: Apache Gobblin > Issue Type: Bug > Components: hive-registration >Affects Versions: 0.14.0 >Reporter: Jay Sen >Priority: Major > Fix For: 0.15.0 > > > Gobblin uses old hive 1.0, compiling against Hive 1.2 is not compatible due > to backward incompatible changes in Hive 1.2 > Also there are many changes in Hive 2.x and 3.x, that we need to think about > on how to provide the support for newer Hive versions in Gobblin -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (GOBBLIN-979) upgrade Hive version
[ https://issues.apache.org/jira/browse/GOBBLIN-979?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jay Sen updated GOBBLIN-979: Parent: GOBBLIN-818 Issue Type: Sub-task (was: Bug) > upgrade Hive version > > > Key: GOBBLIN-979 > URL: https://issues.apache.org/jira/browse/GOBBLIN-979 > Project: Apache Gobblin > Issue Type: Sub-task > Components: hive-registration >Affects Versions: 0.14.0 >Reporter: Jay Sen >Priority: Major > Fix For: 0.15.0 > > > Gobblin uses old hive 1.0, compiling against Hive 1.2 is not compatible due > to backward incompatible changes in Hive 1.2 > Also there are many changes in Hive 2.x and 3.x, that we need to think about > on how to provide the support for newer Hive versions in Gobblin -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (GOBBLIN-979) upgrade Hive version
Jay Sen created GOBBLIN-979: --- Summary: upgrade Hive version Key: GOBBLIN-979 URL: https://issues.apache.org/jira/browse/GOBBLIN-979 Project: Apache Gobblin Issue Type: Bug Reporter: Jay Sen Gobblin uses the old Hive 1.0 and does not compile against Hive 1.2 because of backward-incompatible changes in Hive 1.2. There are also many changes in Hive 2.x and 3.x, so we need to think about how to support newer Hive versions in Gobblin.
[jira] [Updated] (GOBBLIN-979) upgrade Hive version
[ https://issues.apache.org/jira/browse/GOBBLIN-979?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jay Sen updated GOBBLIN-979: Component/s: hive-registration Fix Version/s: 0.15.0 Affects Version/s: 0.14.0 > upgrade Hive version > > > Key: GOBBLIN-979 > URL: https://issues.apache.org/jira/browse/GOBBLIN-979 > Project: Apache Gobblin > Issue Type: Bug > Components: hive-registration >Affects Versions: 0.14.0 >Reporter: Jay Sen >Priority: Major > Fix For: 0.15.0 > > > Gobblin uses old hive 1.0, compiling against Hive 1.2 is not compatible due > to backward incompatible changes in Hive 1.2 > Also there are many changes in Hive 2.x and 3.x, that we need to think about > on how to provide the support for newer Hive versions in Gobblin -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (GOBBLIN-818) upgrade default hadoop versions to 2.7.x
[ https://issues.apache.org/jira/browse/GOBBLIN-818?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jay Sen updated GOBBLIN-818: Description: Gobblin should also upgrade Hadoop from 2.3 to 2.7.x at least, which is very stable. (was: Gobblin uses old hive 1.0, compiling against Hive 1.2 is not compatible due to backward incompatible changes in Hive 1.2. Gobblin should also upgrade Hadoop from 2.3 to 2.7.7 at least which is very stable. ) > upgrade default hadoop versions to 2.7.x > > > Key: GOBBLIN-818 > URL: https://issues.apache.org/jira/browse/GOBBLIN-818 > Project: Apache Gobblin > Issue Type: Improvement >Reporter: Jay Sen >Priority: Major > > Gobblin should also upgrade Hadoop from 2.3 to 2.7.x at least, which is very > stable. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (GOBBLIN-818) upgrade default hadoop versions to 2.7.x
[ https://issues.apache.org/jira/browse/GOBBLIN-818?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jay Sen updated GOBBLIN-818: Summary: upgrade default hadoop versions to 2.7.x (was: upgrade default hadoop versions to 2.7.x and hive version to 1.2) > upgrade default hadoop versions to 2.7.x > > > Key: GOBBLIN-818 > URL: https://issues.apache.org/jira/browse/GOBBLIN-818 > Project: Apache Gobblin > Issue Type: Improvement >Reporter: Jay Sen >Priority: Major > > Gobblin uses old hive 1.0, compiling against Hive 1.2 is not compatible due > to backward incompatible changes in Hive 1.2. > Gobblin should also upgrade Hadoop from 2.3 to 2.7.7 at least which is very > stable. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (GOBBLIN-959) gobblin-http module fails to compile due to missing dependency
Jay Sen created GOBBLIN-959: --- Summary: gobblin-http module fails to compile due to missing dependency Key: GOBBLIN-959 URL: https://issues.apache.org/jira/browse/GOBBLIN-959 Project: Apache Gobblin Issue Type: Bug Affects Versions: 0.14.0 Reporter: Jay Sen Fix For: 0.15.0 You can see the following error while compiling this module:
{{./gradlew :gobblin-modules:gobblin-http:build -x findbugsMain -x rat -x checkstyleMain}}
{{> Task :gobblin-modules:gobblin-http:compileTestJava FAILED}}
{{/Users/jsenjaliya/src/apache/gobblin/gobblin-modules/gobblin-http/src/test/java/org/apache/gobblin/util/HttpUtilsTest.java:28: error: package junit.framework does not exist}}
{{import junit.framework.Assert;}}
{{ ^}}
The module's Gradle file is simply missing the JUnit dependency.
[jira] [Created] (GOBBLIN-952) PathAlterationListener needs to pick up only modified jobs
Jay Sen created GOBBLIN-952: --- Summary: PathAlterationListener needs to pick up only modified jobs Key: GOBBLIN-952 URL: https://issues.apache.org/jira/browse/GOBBLIN-952 Project: Apache Gobblin Issue Type: Bug Affects Versions: 0.14.0 Reporter: Jay Sen Fix For: 0.15.0
{{PathAlterationListenerAdaptorForMonitor}} detects any change under the job config path, which is the required functionality, but it also tries to schedule or unschedule a job for every change it sees. If a job is configured to run only once, its config file is moved to a .done file, and the monitor then tries to unschedule the job. The monitor may need additional logic to ignore changes like this. Here is the log:
{code:java}
22:57:38.803 INFO: [TaskExecutor STOPPING-97] [org.apache.gobblin.runtime.TaskExecutor] TaskExecutor - Successfully shutdown ExecutorService: com.google.common.util.concurrent.MoreExecutors$ScheduledListeningDecorator@6b5a367e
22:57:38.803 INFO: [TaskExecutor STOPPING-97] [org.apache.gobblin.runtime.TaskExecutor] TaskExecutor - Attempting to shutdown ExecutorService: com.google.common.util.concurrent.MoreExecutors$ListeningDecorator@659dd565
22:57:38.803 INFO: [TaskExecutor STOPPING-97] [org.apache.gobblin.runtime.TaskExecutor] TaskExecutor - Successfully shutdown ExecutorService: com.google.common.util.concurrent.MoreExecutors$ListeningDecorator@659dd565
22:57:38.803 INFO: [LocalTaskStateTracker STOPPING-98] [org.apache.gobblin.runtime.local.LocalTaskStateTracker] LocalTaskStateTracker - Stopping the task state tracker
22:57:38.803 INFO: [LocalTaskStateTracker STOPPING-98] [org.apache.gobblin.runtime.local.LocalTaskStateTracker] LocalTaskStateTracker - Attempting to shutdown ExecutorService: com.google.common.util.concurrent.MoreExecutors$ScheduledListeningDecorator@5faf4e52
22:57:38.804 INFO: [LocalTaskStateTracker STOPPING-98] [org.apache.gobblin.runtime.local.LocalTaskStateTracker] LocalTaskStateTracker - Successfully shutdown ExecutorService: com.google.common.util.concurrent.MoreExecutors$ScheduledListeningDecorator@5faf4e52
22:57:38.804 INFO: [JobScheduler-2-70] [org.apache.gobblin.source.extractor.filebased.FileBasedSource] FileBasedSource - Shutting down the FileSystemHelper connection
22:57:56.476 INFO: [newDaemonThreadFactory-26] [org.apache.gobblin.scheduler.PathAlterationListenerAdaptorForMonitor] PathAlterationListenerAdaptorForMonitor - Detected deletion of job configuration file file:/home/jsenjaliya/gobblin-dist/gobblin-jobs/local-aerospike-avro.pull
22:57:56.476 INFO: [newDaemonThreadFactory-26] [org.apache.gobblin.scheduler.PathAlterationListenerAdaptorForMonitor] PathAlterationListenerAdaptorForMonitor - Could not find a scheduled job to unschedule with path /home/jsenjaliya/gobblin-dist/gobblin-jobs/local-aerospike-avro.pull
{code}
-- This message was sent by Atlassian Jira (v8.3.4#803005)
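The "additional logic" asked for in GOBBLIN-952 could look roughly like the sketch below. It is only an illustration: {{DoneRenameFilter}} and its method are hypothetical names, not Gobblin classes, and the set of existing paths stands in for whatever file-system lookup the monitor would really do. The idea is that a deletion is ignored when a sibling file with the .done suffix explains it.

```java
import java.util.Set;

/** Hypothetical helper, not part of Gobblin: filters deletion events seen by a path monitor. */
final class DoneRenameFilter {
    /** Suffix given to the config of a completed run-once job, as described in the ticket. */
    static final String DONE_SUFFIX = ".done";

    /**
     * Returns true when a deleted job config should really be unscheduled,
     * i.e. when no "<path>.done" sibling exists that would explain the deletion as a rename.
     *
     * @param deletedConfigPath path reported as deleted, e.g. "/jobs/a.pull"
     * @param existingPaths     paths currently present (stand-in for a file-system check)
     */
    static boolean shouldUnschedule(String deletedConfigPath, Set<String> existingPaths) {
        return !existingPaths.contains(deletedConfigPath + DONE_SUFFIX);
    }
}
```

With this check in front of the unschedule call, the "Detected deletion" event in the log above would be skipped, because local-aerospike-avro.pull.done exists at that point.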
[jira] [Created] (GOBBLIN-949) add dry run in gobblin.sh script
Jay Sen created GOBBLIN-949: --- Summary: add dry run in gobblin.sh script Key: GOBBLIN-949 URL: https://issues.apache.org/jira/browse/GOBBLIN-949 Project: Apache Gobblin Issue Type: Bug Affects Versions: 0.14.0 Reporter: Jay Sen Fix For: 0.15.0 Often we just need to see what is going to be executed without actually executing it, i.e. a dry run. This also helps with testing the script itself. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (GOBBLIN-943) remove pid file only when it exists
Jay Sen created GOBBLIN-943: --- Summary: remove pid file only when it exists Key: GOBBLIN-943 URL: https://issues.apache.org/jira/browse/GOBBLIN-943 Project: Apache Gobblin Issue Type: Sub-task Reporter: Jay Sen -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (GOBBLIN-942) use HOCON for job configurations instead of Java Properties
[ https://issues.apache.org/jira/browse/GOBBLIN-942?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jay Sen updated GOBBLIN-942: Summary: use HOCON for job configurations instead of Java Properties (was: HOCON vs Java Prop for job configurations) > use HOCON for job configurations instead of Java Properties > --- > > Key: GOBBLIN-942 > URL: https://issues.apache.org/jira/browse/GOBBLIN-942 > Project: Apache Gobblin > Issue Type: Sub-task >Reporter: Jay Sen >Priority: Major > > Currently {{PullFileLoader}} uses the Java properties loader for *.pull and *.job files and the HOCON loader for *.json and *.conf files. > This introduces a lot of inconsistency between how platform config and job config are treated. I guess we previously used the \{env:***} format for env variables, but GOBBLIN-939 makes it HOCON compatible. > The current issue with treating a job config file as a Java properties file is that it adds quotes around an already quoted string, which HOCON handles well. > > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (GOBBLIN-942) HOCON vs Java Prop for job configurations
Jay Sen created GOBBLIN-942: --- Summary: HOCON vs Java Prop for job configurations Key: GOBBLIN-942 URL: https://issues.apache.org/jira/browse/GOBBLIN-942 Project: Apache Gobblin Issue Type: Sub-task Reporter: Jay Sen Currently {{PullFileLoader}} uses the Java properties loader for *.pull and *.job files and the HOCON loader for *.json and *.conf files. This introduces a lot of inconsistency between how platform config and job config are treated. I guess we previously used the \{env:***} format for env variables, but GOBBLIN-939 makes it HOCON compatible. The current issue with treating a job config file as a Java properties file is that it adds quotes around an already quoted string, which HOCON handles well. -- This message was sent by Atlassian Jira (v8.3.4#803005)
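One way to see the quoting problem from GOBBLIN-942 with nothing but the JDK: {{java.util.Properties}} treats double quotes as ordinary characters, so a value written as a quoted string keeps its quotes, and any later step that quotes the value again ends up double-quoting it. The sketch below only demonstrates the properties side; {{QuotedPropertyDemo}} is a hypothetical name and the key {{job.name}} is just an example, and the HOCON side is not shown because it needs the Typesafe Config library.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

/** Hypothetical demo class: shows how java.util.Properties treats a quoted value. */
final class QuotedPropertyDemo {
    /** Loads the given properties-file content and returns the raw value stored for key. */
    static String loadValue(String fileContent, String key) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(fileContent));
        } catch (IOException e) {
            // A StringReader does not fail in practice; rethrow unchecked to keep the sketch short.
            throw new UncheckedIOException(e);
        }
        return props.getProperty(key);
    }
}
```

For the input line {{job.name="my job"}}, {{loadValue}} returns the eight characters {{"my job"}} including both quote characters, whereas a HOCON parser would read the same line as the string {{my job}}.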
[jira] [Updated] (GOBBLIN-939) Integrate usage of env variables in gobblin scripts and configs
[ https://issues.apache.org/jira/browse/GOBBLIN-939?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jay Sen updated GOBBLIN-939: Description: 1. standardize config with ENV variables 2. define default env variables in gobblin-env.sh was:standardize the ability to override gobblin variables via env variable with new gobblin.sh script. > Integrate usage of env variables in gobblin scripts and configs > --- > > Key: GOBBLIN-939 > URL: https://issues.apache.org/jira/browse/GOBBLIN-939 > Project: Apache Gobblin > Issue Type: Improvement >Affects Versions: 0.14.0 >Reporter: Jay Sen >Priority: Major > Fix For: 0.15.0 > > > 1. standardize config with ENV variables > 2. define default env variables in gobblin-env.sh -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (GOBBLIN-939) Integrate usage of env variables in gobblin scripts and configs
[ https://issues.apache.org/jira/browse/GOBBLIN-939?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jay Sen updated GOBBLIN-939: Summary: Integrate usage of env variables in gobblin scripts and configs (was: standardize the ability to override gobblin variables via env variable with new gobblin.sh script) > Integrate usage of env variables in gobblin scripts and configs > --- > > Key: GOBBLIN-939 > URL: https://issues.apache.org/jira/browse/GOBBLIN-939 > Project: Apache Gobblin > Issue Type: Improvement >Affects Versions: 0.14.0 >Reporter: Jay Sen >Priority: Major > Fix For: 0.15.0 > > > standardize the ability to override gobblin variables via env variable with > new gobblin.sh script. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (GOBBLIN-939) standardize the ability to override gobblin variables via env variable with new gobblin.sh script
Jay Sen created GOBBLIN-939: --- Summary: standardize the ability to override gobblin variables via env variable with new gobblin.sh script Key: GOBBLIN-939 URL: https://issues.apache.org/jira/browse/GOBBLIN-939 Project: Apache Gobblin Issue Type: Improvement Affects Versions: 0.14.0 Reporter: Jay Sen Fix For: 0.15.0 standardize the ability to override gobblin variables via env variable with new gobblin.sh script. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (GOBBLIN-937) fix help text and align it with variable names
Jay Sen created GOBBLIN-937: --- Summary: fix help text and align it with variable names Key: GOBBLIN-937 URL: https://issues.apache.org/jira/browse/GOBBLIN-937 Project: Apache Gobblin Issue Type: Sub-task Affects Versions: 0.14.0 Reporter: Jay Sen Fix For: 0.15.0 -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (GOBBLIN-936) check if job is disabled first before parsing and scheduling job
Jay Sen created GOBBLIN-936: --- Summary: check if job is disabled first before parsing and scheduling job Key: GOBBLIN-936 URL: https://issues.apache.org/jira/browse/GOBBLIN-936 Project: Apache Gobblin Issue Type: Improvement Components: gobblin-core Affects Versions: 0.14.0 Reporter: Jay Sen Assignee: Abhishek Tiwari Fix For: 0.15.0 If a job is disabled, Gobblin should not even try to parse its config or do any other work for it. Currently this is checked at the very end of scheduling the job; it should be the first thing checked, so that the job is skipped if it is disabled. -- This message was sent by Atlassian Jira (v8.3.4#803005)
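The ordering proposed in GOBBLIN-936 can be sketched as below. This is not Gobblin's scheduler: {{DisabledFirstScheduler}} and {{parseAndResolve}} are hypothetical stand-ins, and the key name {{job.disabled}} is an assumption about the config key. The point is only that the cheap flag check runs before any expensive parsing.

```java
import java.util.Properties;

/** Hypothetical scheduler sketch: checks the disabled flag before doing any expensive work. */
final class DisabledFirstScheduler {
    /** Assumed name of the config key that marks a job as disabled. */
    static final String JOB_DISABLED_KEY = "job.disabled";

    private int parseCount = 0;

    /** Returns true if the job was scheduled, false if it was skipped because it is disabled. */
    boolean schedule(Properties jobProps) {
        // Cheap check first: a disabled job never reaches the parsing step.
        if (Boolean.parseBoolean(jobProps.getProperty(JOB_DISABLED_KEY, "false"))) {
            return false;
        }
        parseAndResolve(jobProps);
        return true;
    }

    /** Stand-in for template resolution, validation, and the rest of the scheduling work. */
    private void parseAndResolve(Properties jobProps) {
        parseCount++;
    }

    /** Number of times the expensive step ran; only here so the sketch can be checked. */
    int parseCount() {
        return parseCount;
    }
}
```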
[jira] [Closed] (GOBBLIN-935) reloading job config throws NPE
[ https://issues.apache.org/jira/browse/GOBBLIN-935?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jay Sen closed GOBBLIN-935. --- Resolution: Duplicate > reloading job config throws NPE > > > Key: GOBBLIN-935 > URL: https://issues.apache.org/jira/browse/GOBBLIN-935 > Project: Apache Gobblin > Issue Type: Improvement > Components: gobblin-core >Affects Versions: 0.14.0 >Reporter: Jay Sen >Assignee: Abhishek Tiwari >Priority: Major > Fix For: 0.15.0 > > >
> Steps to reproduce:
> # Run Gobblin in standalone mode (I used standalone mode, but all other modes should behave the same).
> # Wait for any job to finish, successfully or unsuccessfully.
> # Rename the completed job from .pull.done to .pull, OR make any change to the existing .pull file (like adding a space).
> # The job will be picked up by the observer and will throw an NPE in {{SchedulerUtils}}.{{resolveTemplate}}.
> This is due to the null JobSpecResolver, which happens only when the job config is reloaded; it does not happen in the regular flow, since that takes a different code path.
> -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (GOBBLIN-935) reloading job config throws NPE
[ https://issues.apache.org/jira/browse/GOBBLIN-935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16963199#comment-16963199 ] Jay Sen commented on GOBBLIN-935: - I just saw, once I rebased the repo, that @William Lo got to this bug before me. Thanks, @William Lo, for fixing this. > reloading job config throws NPE > > > Key: GOBBLIN-935 > URL: https://issues.apache.org/jira/browse/GOBBLIN-935 > Project: Apache Gobblin > Issue Type: Improvement > Components: gobblin-core >Affects Versions: 0.14.0 >Reporter: Jay Sen >Assignee: Abhishek Tiwari >Priority: Major > Fix For: 0.15.0 > > >
> Steps to reproduce:
> # Run Gobblin in standalone mode (I used standalone mode, but all other modes should behave the same).
> # Wait for any job to finish, successfully or unsuccessfully.
> # Rename the completed job from .pull.done to .pull, OR make any change to the existing .pull file (like adding a space).
> # The job will be picked up by the observer and will throw an NPE in {{SchedulerUtils}}.{{resolveTemplate}}.
> This is due to the null JobSpecResolver, which happens only when the job config is reloaded; it does not happen in the regular flow, since that takes a different code path.
> -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (GOBBLIN-935) reloading job config throws NPE
Jay Sen created GOBBLIN-935: --- Summary: reloading job config throws NPE Key: GOBBLIN-935 URL: https://issues.apache.org/jira/browse/GOBBLIN-935 Project: Apache Gobblin Issue Type: Improvement Components: gobblin-core Affects Versions: 0.14.0 Reporter: Jay Sen Assignee: Abhishek Tiwari Fix For: 0.15.0
Steps to reproduce:
# Run Gobblin in standalone mode (I used standalone mode, but all other modes should behave the same).
# Wait for any job to finish, successfully or unsuccessfully.
# Rename the completed job from .pull.done to .pull, OR make any change to the existing .pull file (like adding a space).
# The job will be picked up by the observer and will throw an NPE in {{SchedulerUtils}}.{{resolveTemplate}}.
This is due to the null JobSpecResolver, which happens only when the job config is reloaded; it does not happen in the regular flow, since that takes a different code path.
-- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (GOBBLIN-934) scheduling job logic is broken in PathAlterationListenerAdaptorForMonitor
Jay Sen created GOBBLIN-934: --- Summary: scheduling job logic is broken in PathAlterationListenerAdaptorForMonitor Key: GOBBLIN-934 URL: https://issues.apache.org/jira/browse/GOBBLIN-934 Project: Apache Gobblin Issue Type: Improvement Components: gobblin-core Affects Versions: 0.14.0 Reporter: Jay Sen Assignee: Abhishek Tiwari Fix For: 0.15.0 -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (GOBBLIN-929) create ZookeeperCommitSequenceStore
Jay Sen created GOBBLIN-929: --- Summary: create ZookeeperCommitSequenceStore Key: GOBBLIN-929 URL: https://issues.apache.org/jira/browse/GOBBLIN-929 Project: Apache Gobblin Issue Type: Improvement Components: gobblin-api, gobblin-core Affects Versions: 0.14.0 Reporter: Jay Sen Assignee: Hung Tran Fix For: 0.15.0 Currently, for {{EXACTLY_ONCE}} semantics, only the file-system-based {{FsCommitSequenceStore}} is available, which can be slow for streaming use cases. We should implement ZooKeeper-based checkpointing for faster service. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (GOBBLIN-927) config to properties fails to convert configObject to string
[ https://issues.apache.org/jira/browse/GOBBLIN-927?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jay Sen updated GOBBLIN-927: Description: If you specify source.schema = \{\} then {{ConfigUtils.configToProperties}} fails to convert this object directly to a string. The solution is to convert any such object to a string. was: If you specify source.schema = { } then {{ConfigUtils.configToProperties}} fails to convert this object directly to a string. The solution is to convert any such object to a string. > config to properties fails to convert configObject to string > > > Key: GOBBLIN-927 > URL: https://issues.apache.org/jira/browse/GOBBLIN-927 > Project: Apache Gobblin > Issue Type: Bug >Reporter: Jay Sen >Priority: Minor > > If you specify source.schema = \{\} then {{ConfigUtils.configToProperties}} fails to convert this object directly to a string. > The solution is to convert any such object to a string. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (GOBBLIN-927) config to properties fails to convert configObject to string
Jay Sen created GOBBLIN-927: --- Summary: config to properties fails to convert configObject to string Key: GOBBLIN-927 URL: https://issues.apache.org/jira/browse/GOBBLIN-927 Project: Apache Gobblin Issue Type: Bug Reporter: Jay Sen If you specify source.schema = { } then {{ConfigUtils.configToProperties}} fails to convert this object directly to a string. The solution is to convert any such object to a string. -- This message was sent by Atlassian Jira (v8.3.4#803005)
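The fix proposed in GOBBLIN-927, stringifying every value before it goes into the properties, can be sketched with JDK types only. This is an illustration, not the real {{ConfigUtils}} code: a plain {{Map}} stands in for the Typesafe Config object, {{String.valueOf}} stands in for whatever rendering the real config values would use, and {{StringifyingConverter}} is a hypothetical name.

```java
import java.util.Map;
import java.util.Properties;

/** Hypothetical converter sketch; a plain Map stands in for a Typesafe Config object. */
final class StringifyingConverter {
    /**
     * Copies every entry into a Properties object, converting each value to a String first.
     * Properties.getProperty returns null for values that are not Strings, so an object value
     * stored as-is would be lost to callers that read it as a property.
     */
    static Properties toProperties(Map<String, Object> config) {
        Properties props = new Properties();
        for (Map.Entry<String, Object> entry : config.entrySet()) {
            props.setProperty(entry.getKey(), String.valueOf(entry.getValue()));
        }
        return props;
    }
}
```

With an empty map as the value of {{source.schema}}, the resulting property is the string {{{}}} rather than a failed conversion.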
[jira] [Created] (GOBBLIN-926) use typesafe config consistently everywhere
Jay Sen created GOBBLIN-926: --- Summary: use typesafe config consistently everywhere Key: GOBBLIN-926 URL: https://issues.apache.org/jira/browse/GOBBLIN-926 Project: Apache Gobblin Issue Type: Improvement Affects Versions: 0.14.0 Reporter: Jay Sen Fix For: 0.15.0 Basically, remove the usage of, and conversion to, java.util.Properties. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (GOBBLIN-920) simple json example is missing Apache Commons VFS dep.
Jay Sen created GOBBLIN-920: --- Summary: simple json example is missing Apache Commons VFS dep. Key: GOBBLIN-920 URL: https://issues.apache.org/jira/browse/GOBBLIN-920 Project: Apache Gobblin Issue Type: Bug Reporter: Jay Sen When you run the simple JSON example from the gobblin-example module in MapReduce mode, it reports an error about a missing Apache Commons VFS dependency. The solution is to include the dependency in gobblin-example so that it is placed in lib and can be used when running the example. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (GOBBLIN-901) add some info logging to FsDataWriter
Jay Sen created GOBBLIN-901: --- Summary: add some info logging to FsDataWriter Key: GOBBLIN-901 URL: https://issues.apache.org/jira/browse/GOBBLIN-901 Project: Apache Gobblin Issue Type: Bug Components: gobblin-core Affects Versions: 0.14.0 Reporter: Jay Sen Assignee: Abhishek Tiwari Fix For: 0.15.0 Moving the staging data to the final output dir sometimes fails for various reasons, and it is hard to debug without proper info. Adding some more logging would help debug such issues, especially for new users. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Closed] (GOBBLIN-900) FsDataWriter doesn't use writer config for FileSystem
[ https://issues.apache.org/jira/browse/GOBBLIN-900?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jay Sen closed GOBBLIN-900. --- Resolution: Duplicate > FsDataWriter doesn't use writer config for FileSystem > -- > > Key: GOBBLIN-900 > URL: https://issues.apache.org/jira/browse/GOBBLIN-900 > Project: Apache Gobblin > Issue Type: Bug > Components: gobblin-core >Affects Versions: 0.14.0 >Reporter: Jay Sen >Assignee: Abhishek Tiwari >Priority: Major > Fix For: 0.15.0 > > > FsDataWriter.java:107 > {{this.fileContext = FileContext.getFileContext(conf);}} > FileContext is used while moving data from the staging path to the final output path. > Currently it uses the default Hadoop Configuration, which can be picked up from a local Hadoop config on the classpath; that is not right. The writer filesystem should always be driven by the writer config specified in the job or platform config. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (GOBBLIN-900) FsDataWriter doesn't use writer config for FileSystem
[ https://issues.apache.org/jira/browse/GOBBLIN-900?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16944156#comment-16944156 ] Jay Sen commented on GOBBLIN-900: - Looks like GOBBLIN-889 was filed for the same issue but solves it a bit differently. My fix was the following one-line change: {{this.fileContext = FileContext.getFileContext(this.fs.getUri());}} {{this.fs}}, which is the FileSystem object, is already set above this line, so we can get the URI directly from it. > FsDataWriter doesn't use writer config for FileSystem > -- > > Key: GOBBLIN-900 > URL: https://issues.apache.org/jira/browse/GOBBLIN-900 > Project: Apache Gobblin > Issue Type: Bug > Components: gobblin-core >Affects Versions: 0.14.0 >Reporter: Jay Sen >Assignee: Abhishek Tiwari >Priority: Major > Fix For: 0.15.0 > > > FsDataWriter.java:107 > {{this.fileContext = FileContext.getFileContext(conf);}} > FileContext is used while moving data from the staging path to the final output path. > Currently it uses the default Hadoop Configuration, which can be picked up from a local Hadoop config on the classpath; that is not right. The writer filesystem should always be driven by the writer config specified in the job or platform config. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (GOBBLIN-898) successful Job completion wrongly calls onJobFailure
Jay Sen created GOBBLIN-898: --- Summary: successful Job completion wrongly calls onJobFailure Key: GOBBLIN-898 URL: https://issues.apache.org/jira/browse/GOBBLIN-898 Project: Apache Gobblin Issue Type: Bug Reporter: Jay Sen
We came across this code snippet, which registers the jobListener for the JobCompleteTimer and JobSucceededTimer events. It looks like the JobSucceededTimer event calls the onJobFailure function instead of the onJobSucced function.
notifyListeners(this.jobContext, jobListener, TimingEvent.LauncherTimings.JOB_SUCCEEDED, new JobListenerAction() {
  @Override
  public void apply(JobListener jobListener, JobContext jobContext) throws Exception {
    jobListener.onJobFailure(jobContext);
  }
});
Qinghe from our team found this, so I am reporting it here on his behalf.
-- This message was sent by Atlassian Jira (v8.3.4#803005)
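To make the GOBBLIN-898 report concrete, here is a self-contained sketch of what the corrected dispatch might look like. Everything in it is a stand-in: the interfaces are simplified copies written for this example (no {{JobContext}}, no checked exceptions), and {{onJobSucceeded}} is a hypothetical name for the success callback, since the ticket does not confirm what the real method is called.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical, simplified stand-ins for the listener types quoted in the ticket. */
final class SucceededCallbackSketch {
    interface JobListener {
        void onJobSucceeded(String jobId); // assumed name for the success callback
        void onJobFailure(String jobId);
    }

    interface JobListenerAction {
        void apply(JobListener jobListener, String jobId);
    }

    /** Listener that records which callback was invoked, so the dispatch can be checked. */
    static final class RecordingListener implements JobListener {
        final List<String> calls = new ArrayList<>();

        @Override
        public void onJobSucceeded(String jobId) {
            calls.add("succeeded:" + jobId);
        }

        @Override
        public void onJobFailure(String jobId) {
            calls.add("failure:" + jobId);
        }
    }

    static void notifyListener(JobListener listener, String jobId, JobListenerAction action) {
        action.apply(listener, jobId);
    }

    /** Dispatches a job-succeeded event; the action calls the success callback, not onJobFailure. */
    static List<String> reportSuccess(String jobId) {
        RecordingListener listener = new RecordingListener();
        notifyListener(listener, jobId, new JobListenerAction() {
            @Override
            public void apply(JobListener jobListener, String id) {
                jobListener.onJobSucceeded(id);
            }
        });
        return listener.calls;
    }
}
```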
[jira] [Created] (GOBBLIN-867) use logger from class instead of passing it around
Jay Sen created GOBBLIN-867: --- Summary: use logger from class instead of passing it around Key: GOBBLIN-867 URL: https://issues.apache.org/jira/browse/GOBBLIN-867 Project: Apache Gobblin Issue Type: Improvement Reporter: Jay Sen Some classes, mainly their static methods, expect a logger as an argument, so callers of those methods have to pass one in. These static methods are utilities anyway, so they should just have their own logger. -- This message was sent by Atlassian Jira (v8.3.2#803003)
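The pattern GOBBLIN-867 asks for is a utility class that owns a private static logger instead of taking one as a parameter. The sketch below uses {{java.util.logging}} only to stay self-contained; the logging framework Gobblin actually uses is not stated in the ticket, and {{PathUtilsSketch}} and {{normalize}} are hypothetical names invented for this example.

```java
import java.util.logging.Logger;

/** Hypothetical utility class that owns its logger instead of receiving one from callers. */
final class PathUtilsSketch {
    private static final Logger LOG = Logger.getLogger(PathUtilsSketch.class.getName());

    private PathUtilsSketch() {
    }

    /** Collapses repeated slashes and drops a trailing slash; no Logger parameter is needed. */
    static String normalize(String path) {
        String collapsed = path.replaceAll("/+", "/");
        if (collapsed.length() > 1 && collapsed.endsWith("/")) {
            collapsed = collapsed.substring(0, collapsed.length() - 1);
        }
        LOG.fine("Normalized " + path + " to " + collapsed);
        return collapsed;
    }
}
```

Callers simply write {{PathUtilsSketch.normalize(path)}}; the log records are attributed to the utility class itself rather than to whichever caller happened to pass its logger.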
[jira] [Updated] (GOBBLIN-707) combine & standardize all gobblin scripts into one master script & restructure configs accordingly
[ https://issues.apache.org/jira/browse/GOBBLIN-707?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jay Sen updated GOBBLIN-707:
Description: Gobblin supports multiple execution modes (CLI, Standalone, cluster-master, cluster-worker, AWS, YARN, MR) and various command-line utilities to run CLI and admin commands. The problem is that each CLI and execution mode has an individual script to manage the service. Having individual scripts introduces a lot of issues:
# All scripts handle Gobblin variables and user parameters differently, and this is highly inconsistent across the various Gobblin scripts, not to mention the different features supported by different scripts.
# Functionality around start, stop, status checking, and PID handling, among a lot of other things, varies vastly with the implementation of each script.
# Features like GC & JVM params, log4j file selection, classpath calculation, etc. exist in some Gobblin scripts but not all, adding to an inconsistent user experience.
# Code duplication: all the Gobblin scripts share a lot of common code to handle params, start and stop services, status checks, PID handling, etc. Combining all the scripts into one not only makes maintenance easier but also brings clarity and consistency.
# Basically, the current 13 different scripts confuse new users about how to use Gobblin.
Solution:
1. There can be one gobblin.sh script to handle all Gobblin commands and deployment options, as per the following signature.
NOTE: This {{gobblin.sh }} {{gobblin.sh }}
{{commands values: admin, cli, statestore-check, statestore-clean, historystore-manager, classpath}}
{{service values: standalone, cluster-master, cluster-worker, aws, yarn, mr, service}}
With the above change, the following become valid commands.
{code:java}
# all under GobblinCli class
gobblin run listQuickApps -> gobblin cli run listQuickApps
gobblin run -> gobblin cli run
# class: JobStateToJsonConverter
statestore-checker.sh -> gobblin cli job-state-to-json
# class: StateStoreCleaner
statestore-clean.sh -> the class is deprecated so no need to migrate this over.
# class: DatabaseJobHistoryStoreSchemaManager
historystore-manager.sh -> gobblin cli job-store-schema-manager
# class: Cli
gobblin-admin.sh -> gobblin cli admin
# all gobblin deployment modes
gobblin-cluster-master.sh -> gobblin service cluster-master start|stop|status
gobblin-cluster-worker.sh -> gobblin service cluster-worker start|stop|status
gobblin-compaction.sh -> gobblin-compaction.sh ( kept as it is for now, can be migrated to new script framework)
gobblin-mapreduce.sh -> gobblin service mapreduce start|stop|status
gobblin-service.sh -> gobblin service service-manager start|stop|status
gobblin-standalone.sh -> gobblin service standalone start|stop|status
gobblin-yarn.sh -> gobblin service yarn start|stop|status
{code}
2. Also, all configurations for each mode need to be structured and de-duplicated accordingly, to make it clear which config will be picked up for which execution mode. This would be well defined in the command help instructions.
{color:#ff} NOTE: this refactoring adds all cli and service commands to gobblin.sh and hence changes the syntax for all commands and services.{color}
was: Gobblin supports multiple execution modes (CLI, Standalone, cluster-master, cluster-worker, AWS, YARN, MR) and various command-line utilities to run CLI and admin commands. The problem is that each CLI and execution mode has an individual script to manage the service. Having individual scripts introduces a lot of issues:
# All scripts handle Gobblin variables and user parameters differently, and this is highly inconsistent across the various Gobblin scripts, not to mention the different features supported by different scripts.
# Functionality around start, stop, status checking, and PID handling, among a lot of other things, varies vastly with the implementation of each script.
# Features like GC & JVM params, log4j file selection, classpath calculation, etc. exist in some Gobblin scripts but not all, adding to an inconsistent user experience.
# Code duplication: all the Gobblin scripts share a lot of common code to handle params, start and stop services, status checks, PID handling, etc. Combining all the scripts into one not only makes maintenance easier but also brings clarity and consistency.
# Basically, the current 13 different scripts confuse new users about how to use Gobblin.
Solution:
1. There can be one gobblin.sh script to handle all Gobblin commands and deployment options, as per the following signature.
NOTE: This {{gobblin.sh }} {{gobblin.sh }}
{{commands values: admin, cli, statestore-check, statestore-clean, historystore-manager, classpath}}
{{service values: standalone,
[jira] [Updated] (GOBBLIN-707) combine & standardize all gobblin scripts into one master script & restructure configs accordingly
[ https://issues.apache.org/jira/browse/GOBBLIN-707?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jay Sen updated GOBBLIN-707:
Description: Gobblin supports multiple execution modes (CLI, Standalone, cluster-master, cluster-worker, AWS, YARN, MR) and various command-line utilities to run CLI and admin commands. The problem is that each CLI and execution mode has an individual script to manage the service. Having individual scripts introduces a lot of issues:
# All scripts handle Gobblin variables and user parameters differently, and this is highly inconsistent across the various Gobblin scripts, not to mention the different features supported by different scripts.
# Functionality around start, stop, status checking, and PID handling, among a lot of other things, varies vastly with the implementation of each script.
# Features like GC & JVM params, log4j file selection, classpath calculation, etc. exist in some Gobblin scripts but not all, adding to an inconsistent user experience.
# Code duplication: all the Gobblin scripts share a lot of common code to handle params, start and stop services, status checks, PID handling, etc. Combining all the scripts into one not only makes maintenance easier but also brings clarity and consistency.
# Basically, the current 13 different scripts confuse new users about how to use Gobblin.
Solution:
1. There can be one gobblin.sh script to handle all Gobblin commands and deployment options, as per the following signature.
NOTE: This {{gobblin.sh }} {{gobblin.sh }}
{{commands values: admin, cli, statestore-check, statestore-clean, historystore-manager, classpath}}
{{service values: standalone, cluster-master, cluster-worker, aws, yarn, mr, service}}
With the above change, the following become valid commands.
{code:java}
# all under GobblinCli class
gobblin run listQuickApps -> gobblin cli run listQuickApps
gobblin run listQuickApps -> gobblin cli run listQuickApps
gobblin run -> gobblin cli run
# class: JobStateToJsonConverter
statestore-checker.sh -> gobblin statestore-checker
# class: StateStoreCleaner
statestore-clean.sh -> gobblin statestore-clean
# class: DatabaseJobHistoryStoreSchemaManager
historystore-manager.sh -> gobblin historystore-manager
# class: Cli
gobblin-admin.sh -> gobblin admin
# all gobblin deployment modes
gobblin-cluster-master.sh -> gobblin cluster-master start|stop|status
gobblin-cluster-worker.sh -> gobblin cluster-master start|stop|status
gobblin-compaction.sh -> gobblin cluster-master start|stop|status
gobblin-env.sh -> gobblin cluster-master start|stop|status
gobblin-mapreduce.sh -> gobblin cluster-master start|stop|status
gobblin-service.sh -> gobblin cluster-master start|stop|status
gobblin-standalone.sh -> gobblin cluster-master start|stop|status
gobblin-yarn.sh -> gobblin cluster-master start|stop|status
{code}
2. Also, configs need to be structured and de-duplicated accordingly, to make it clear which config will be picked up for which execution mode.
{color:#ff} NOTE: this refactoring adds all cli and service commands to gobblin.sh and hence changes the syntax for all commands and services.{color}
was: Gobblin supports multiple execution modes (CLI, Standalone, cluster-master, cluster-worker, AWS, YARN, MR) and various command-line utilities to run CLI and admin commands. There is an individual script for each of them. Having individual scripts introduces a lot of issues:
# All scripts handle Gobblin variables and user parameters differently, and this is highly inconsistent across the various Gobblin scripts.
# Functionality around start, stop, status checking, and PID handling, among a lot of other things, varies vastly with the implementation of each script.
# Features like GC & JVM params, log4j file selection, classpath calculation, etc. exist in some Gobblin scripts but not all, adding to an inconsistent user experience.
# Maintaining a total of 13 scripts would be too much effort. Also, all the Gobblin scripts share a lot of common code to handle params, start and stop services, status checks, PID handling, etc. Combining all the scripts into one not only makes maintenance easier but also brings clarity and consistency.
Solution:
1. There can be one gobblin.sh script to handle all Gobblin commands and deployment options, as per the following signature.
NOTE: This {{gobblin.sh }} {{gobblin.sh }}
{{commands values: admin, cli, statestore-check, statestore-clean, historystore-manager, classpath}}
{{service values: standalone, cluster-master, cluster-worker, aws, yarn, mr, service}}
With the above change, the following become valid commands.
{code:java}
# all under GobblinCli class
gobblin run listQuickApps -> gobblin cli run listQuickApps
gobblin run listQuickApps -> gobblin cli run listQuickApps
gobblin run -> gobblin cli run
#
[jira] [Updated] (GOBBLIN-818) upgrade default hadoop versions to 2.7.x and hive version to 1.2
[ https://issues.apache.org/jira/browse/GOBBLIN-818?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jay Sen updated GOBBLIN-818: Description: Gobblin uses the old Hive 1.0; compiling against Hive 1.2 does not work due to backward-incompatible changes in Hive 1.2. Gobblin should also upgrade Hadoop from 2.3 to at least 2.7.7, which is very stable. was: Gobblin uses the old Hive 1.0; compiling against Hive 1.2 does not work due to backward-incompatible changes in Hive 1.2 > upgrade default hadoop versions to 2.7.x and hive version to 1.2 > > > Key: GOBBLIN-818 > URL: https://issues.apache.org/jira/browse/GOBBLIN-818 > Project: Apache Gobblin > Issue Type: Improvement >Reporter: Jay Sen >Priority: Major > > Gobblin uses the old Hive 1.0; compiling against Hive 1.2 does not work due to backward-incompatible changes in Hive 1.2. > Gobblin should also upgrade Hadoop from 2.3 to at least 2.7.7, which is very stable. -- This message was sent by Atlassian Jira (v8.3.2#803003)
[jira] [Updated] (GOBBLIN-818) upgrade default hadoop versions to 2.7.x and hive version to 1.2
[ https://issues.apache.org/jira/browse/GOBBLIN-818?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jay Sen updated GOBBLIN-818: Description: Gobblin uses old hive 1.0, compiling against Hive 1.2 is not compatible due to backward incompatible changes in Hive 1.2 (was: Gobblin uses old hive 1.0, compiling against Hive 1.2 is not compatible due to backward incompatible changes in Hive 1.2 we should also update Hadoop libs from 2.3 to 2.7.7 at least which is very stable.) > upgrade default hadoop versions to 2.7.x and hive version to 1.2 > > > Key: GOBBLIN-818 > URL: https://issues.apache.org/jira/browse/GOBBLIN-818 > Project: Apache Gobblin > Issue Type: Improvement >Reporter: Jay Sen >Priority: Major > > Gobblin uses old hive 1.0, compiling against Hive 1.2 is not compatible due > to backward incompatible changes in Hive 1.2 -- This message was sent by Atlassian Jira (v8.3.2#803003)
[jira] [Created] (GOBBLIN-866) Re-Arrange Gobblin Modules and Classes as per GIP 2
Jay Sen created GOBBLIN-866: --- Summary: Re-Arrange Gobblin Modules and Classes as per GIP 2 Key: GOBBLIN-866 URL: https://issues.apache.org/jira/browse/GOBBLIN-866 Project: Apache Gobblin Issue Type: Improvement Affects Versions: 0.14.0 Reporter: Jay Sen Fix For: 0.15.0 Re-Arrange Gobblin Modules and Classes as per GIP 2: https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=119547565 -- This message was sent by Atlassian Jira (v8.3.2#803003)
[jira] [Commented] (GOBBLIN-830) update config key to define job launcher type
[ https://issues.apache.org/jira/browse/GOBBLIN-830?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16921056#comment-16921056 ] Jay Sen commented on GOBBLIN-830: -
To support both `launcher.type` and `job.launcher.type`:
- `JobLauncherType` is nested and needs to be taken out; since most of the `jobLauncher` code is defined in `gobblin-runtime`, it needs to be placed under that module.
- Instead of handling if/else in multiple places, I should add a `JobLauncherUtils.getJobLauncherType` method, but that would create a circular dependency between `gobblin-utility` and `gobblin-runtime`.
This is a minor change compared to the amount of refactoring required to support both properties, which touches on the complexity of the project and calls for proper refactoring ( [GIP2](https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=119547565) )
> update config key to define job launcher type > - > > Key: GOBBLIN-830 > URL: https://issues.apache.org/jira/browse/GOBBLIN-830 > Project: Apache Gobblin > Issue Type: Improvement >Reporter: Jay Sen >Priority: Minor > Time Spent: 50m > Remaining Estimate: 0h > > config key JOB_LAUNCHER_TYPE_KEY = "launcher.type" should be set to > "job.launcher.type" -- This message was sent by Atlassian Jira (v8.3.2#803003)
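The single lookup point mentioned in the GOBBLIN-830 comment (instead of if/else in multiple places) might look like the sketch below. It is an assumption-laden illustration: {{LauncherTypeResolver}} is a hypothetical class, the precedence (new key first, then the legacy key, then a default) is one possible choice that the ticket does not specify, and the module-placement problem discussed in the comment is not addressed here.

```java
import java.util.Properties;

/** Hypothetical resolver: reads the launcher type from the new key, then from the legacy key. */
final class LauncherTypeResolver {
    static final String NEW_KEY = "job.launcher.type";
    static final String LEGACY_KEY = "launcher.type";

    /** Returns the configured launcher type, or defaultType when neither key is set. */
    static String resolve(Properties props, String defaultType) {
        String value = props.getProperty(NEW_KEY);
        if (value == null) {
            // Fall back to the old key so existing job configs keep working.
            value = props.getProperty(LEGACY_KEY);
        }
        return value == null ? defaultType : value;
    }
}
```

A config that sets only `launcher.type` keeps working, while a config that sets both keys follows `job.launcher.type`.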
[jira] [Updated] (GOBBLIN-822) upgrade log4j to log4j2
[ https://issues.apache.org/jira/browse/GOBBLIN-822?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jay Sen updated GOBBLIN-822: Description: log4j2 has routing appender that would be super useful and probably only way to achieve " job specific log files" functionality without meddling around fileHandler in log4j. Also log4j2 has lot of new functionalities and performance benefits (ref: HIVE-11304) was: log4j2 has routing appender that would be super useful and probably only way to achieve " job specific log files" functionality without meddling around fileHandler in log4j. Also log4j2 has lot of new functionalities and performance benefits > upgrade log4j to log4j2 > --- > > Key: GOBBLIN-822 > URL: https://issues.apache.org/jira/browse/GOBBLIN-822 > Project: Apache Gobblin > Issue Type: Sub-task >Affects Versions: 0.14.0 >Reporter: Jay Sen >Priority: Major > Fix For: 0.15.0 > > Time Spent: 1h 10m > Remaining Estimate: 0h > > log4j2 has routing appender that would be super useful and probably only way > to achieve " job specific log files" functionality without meddling around > fileHandler in log4j. > Also log4j2 has lot of new functionalities and performance benefits (ref: > HIVE-11304) -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[jira] [Updated] (GOBBLIN-854) update config reader in standalone mode
[ https://issues.apache.org/jira/browse/GOBBLIN-854?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jay Sen updated GOBBLIN-854: Description: standalone mode {{SchedulerDaemon}} uses java properties, this ticket is to use TypeSafe Config instead to make config standardized across the modes and also enable config to take benefits of TypeSafe functionalities. Also it takes 2 different config file as argument, one as default and another as custom, we probably only need one property file. was:standalone mode \{{SchedulerDaemon}} uses java properties, this ticket is to use TypeSafe Config instead to make config standardized across the modes and also enable config to take benefits of TypeSafe functionalities. > update config reader in standalone mode > --- > > Key: GOBBLIN-854 > URL: https://issues.apache.org/jira/browse/GOBBLIN-854 > Project: Apache Gobblin > Issue Type: Improvement >Affects Versions: 0.14.0 >Reporter: Jay Sen >Priority: Major > Fix For: 0.15.0 > > > standalone mode {{SchedulerDaemon}} uses java properties, this ticket is to > use TypeSafe Config instead to make config standardized across the modes and > also enable config to take benefits of TypeSafe functionalities. > Also it takes 2 different config file as argument, one as default and another > as custom, we probably only need one property file. -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[jira] [Created] (GOBBLIN-854) update config reader in standalone mode
Jay Sen created GOBBLIN-854: --- Summary: update config reader in standalone mode Key: GOBBLIN-854 URL: https://issues.apache.org/jira/browse/GOBBLIN-854 Project: Apache Gobblin Issue Type: Improvement Reporter: Jay Sen Standalone mode {{SchedulerDaemon}} uses Java properties. This ticket is to use Typesafe Config instead, so that config is standardized across the modes and can benefit from Typesafe Config functionality. -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[jira] [Updated] (GOBBLIN-854) update config reader in standalone mode
[ https://issues.apache.org/jira/browse/GOBBLIN-854?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jay Sen updated GOBBLIN-854: Affects Version/s: 0.14.0 > update config reader in standalone mode > --- > > Key: GOBBLIN-854 > URL: https://issues.apache.org/jira/browse/GOBBLIN-854 > Project: Apache Gobblin > Issue Type: Improvement >Affects Versions: 0.14.0 >Reporter: Jay Sen >Priority: Major > > standalone mode \{{SchedulerDaemon}} uses java properties, this ticket is to > use TypeSafe Config instead to make config standardized across the modes and > also enable config to take benefits of TypeSafe functionalities. -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[jira] [Updated] (GOBBLIN-854) update config reader in standalone mode
[ https://issues.apache.org/jira/browse/GOBBLIN-854?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jay Sen updated GOBBLIN-854: Fix Version/s: 0.15.0 > update config reader in standalone mode > --- > > Key: GOBBLIN-854 > URL: https://issues.apache.org/jira/browse/GOBBLIN-854 > Project: Apache Gobblin > Issue Type: Improvement >Affects Versions: 0.14.0 >Reporter: Jay Sen >Priority: Major > Fix For: 0.15.0 > > > standalone mode \{{SchedulerDaemon}} uses java properties, this ticket is to > use TypeSafe Config instead to make config standardized across the modes and > also enable config to take benefits of TypeSafe functionalities. -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[jira] [Commented] (GOBBLIN-824) upgrade to latest libraries in Gobblin
[ https://issues.apache.org/jira/browse/GOBBLIN-824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16905621#comment-16905621 ] Jay Sen commented on GOBBLIN-824: - GOBBLIN-818 does minor upgrades. > upgrade to latest libraries in Gobblin > -- > > Key: GOBBLIN-824 > URL: https://issues.apache.org/jira/browse/GOBBLIN-824 > Project: Apache Gobblin > Issue Type: Improvement >Affects Versions: 0.14.0 >Reporter: Jay Sen >Priority: Major > Fix For: 0.15.0 > > > lot of libs are old, like hadoop, hive, etc... > it wont be easy to just comile gobblin with new version via passing new > version on command line, there is lot of changes since last couple of years. > Gobblin should use latest versions > Hadoop: 2.9.x > hive : 2.3.5 > pegasus: 24.0.2 > Avro : 1.8.2 > etc... > please feel free to mention which lib should be updated as part of this > overall upgrade process. -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[jira] [Updated] (GOBBLIN-824) upgrade to latest libraries in Gobblin
[ https://issues.apache.org/jira/browse/GOBBLIN-824?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jay Sen updated GOBBLIN-824: Summary: upgrade to latest libraries in Gobblin (was: upgrade libs versions in Gobblin) > upgrade to latest libraries in Gobblin > -- > > Key: GOBBLIN-824 > URL: https://issues.apache.org/jira/browse/GOBBLIN-824 > Project: Apache Gobblin > Issue Type: Improvement >Affects Versions: 0.14.0 >Reporter: Jay Sen >Priority: Major > Fix For: 0.15.0 > > > lot of libs are old, like hadoop, hive, etc... > it wont be easy to just comile gobblin with new version via passing new > version on command line, there is lot of changes since last couple of years. > Gobblin should use latest versions > Hadoop: 2.9.x > hive : 2.3.5 > pegasus: 24.0.2 > Avro : 1.8.2 > etc... > please feel free to mention which lib should be updated as part of this > overall upgrade process. -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[jira] [Updated] (GOBBLIN-824) upgrade libs versions in Gobblin
[ https://issues.apache.org/jira/browse/GOBBLIN-824?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jay Sen updated GOBBLIN-824: Description: lot of libs are old, like hadoop, hive, etc... it wont be easy to just comile gobblin with new version via passing new version on command line, there is lot of changes since last couple of years. Gobblin should use latest versions Hadoop: 2.9.x hive : 2.3.5 pegasus: 24.0.2 Avro : 1.8.2 etc... please feel free to mention which lib should be updated as part of this overall upgrade process. was: lot of libs are old, like hadoop, hive, etc... it wont be easy to just comile gobblin with new version via passing new version on command line, there is lot of changes since last couple of years. Gobblin should use latest versions hadoop : 2.7.7 hive : 2.3.5 pegasus: 24.0.2 etc... please feel free to mention which lib should be updated as part of this overall upgrade process. > upgrade libs versions in Gobblin > > > Key: GOBBLIN-824 > URL: https://issues.apache.org/jira/browse/GOBBLIN-824 > Project: Apache Gobblin > Issue Type: Improvement >Affects Versions: 0.14.0 >Reporter: Jay Sen >Priority: Major > Fix For: 0.15.0 > > > lot of libs are old, like hadoop, hive, etc... > it wont be easy to just comile gobblin with new version via passing new > version on command line, there is lot of changes since last couple of years. > Gobblin should use latest versions > Hadoop: 2.9.x > hive : 2.3.5 > pegasus: 24.0.2 > Avro : 1.8.2 > etc... > please feel free to mention which lib should be updated as part of this > overall upgrade process. -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[jira] [Updated] (GOBBLIN-818) upgrade default hadoop versions to 2.7.x and hive version to 1.2
[ https://issues.apache.org/jira/browse/GOBBLIN-818?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jay Sen updated GOBBLIN-818: Description: Gobblin uses old hive 1.0, compiling against Hive 1.2 is not compatible due to backward incompatible changes in Hive 1.2 we should also update Hadoop libs from 2.3 to 2.7.7 at least which is very stable. was: Gobblin uses old hive 1.x. Hive 2.x has significant changes and some incompatible/deprecated classes. while hive 3.x is already in pipeline along with Hadoop 3.x, we should move to hive 2.x so user dont have to deal with manual fixes for their use of Gobblin. we should also update Hadoop libs from 2.3 to 2.7.7 at least which is very stable. > upgrade default hadoop versions to 2.7.x and hive version to 1.2 > > > Key: GOBBLIN-818 > URL: https://issues.apache.org/jira/browse/GOBBLIN-818 > Project: Apache Gobblin > Issue Type: Improvement >Reporter: Jay Sen >Priority: Major > > Gobblin uses old hive 1.0, compiling against Hive 1.2 is not compatible due > to backward incompatible changes in Hive 1.2 > we should also update Hadoop libs from 2.3 to 2.7.7 at least which is very > stable. -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[jira] [Updated] (GOBBLIN-818) upgrade default hadoop versions to 2.7.x and hive version to 1.2
[ https://issues.apache.org/jira/browse/GOBBLIN-818?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jay Sen updated GOBBLIN-818: Summary: upgrade default hadoop versions to 2.7.x and hive version to 1.2 (was: MIgrate to Hive 2.x as default) > upgrade default hadoop versions to 2.7.x and hive version to 1.2 > > > Key: GOBBLIN-818 > URL: https://issues.apache.org/jira/browse/GOBBLIN-818 > Project: Apache Gobblin > Issue Type: Improvement >Reporter: Jay Sen >Priority: Major > > Gobblin uses old hive 1.x. > Hive 2.x has significant changes and some incompatible/deprecated classes. > while hive 3.x is already in pipeline along with Hadoop 3.x, we should move > to hive 2.x so user dont have to deal with manual fixes for their use of > Gobblin. > we should also update Hadoop libs from 2.3 to 2.7.7 at least which is very > stable. -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[jira] [Created] (GOBBLIN-850) remove duplicated buildFileSystem method
Jay Sen created GOBBLIN-850: --- Summary: remove duplicated buildFileSystem method Key: GOBBLIN-850 URL: https://issues.apache.org/jira/browse/GOBBLIN-850 Project: Apache Gobblin Issue Type: Improvement Affects Versions: 0.14.0 Reporter: Jay Sen Fix For: 0.15.0 {{FileSystem buildFileSystem(Properties jobProps, Configuration configuration)}} is duplicated in almost all deployment modes; such functions should be part of gobblin-core:utils. -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[jira] [Created] (GOBBLIN-845) upgrade to latest gradle
Jay Sen created GOBBLIN-845: --- Summary: upgrade to latest gradle Key: GOBBLIN-845 URL: https://issues.apache.org/jira/browse/GOBBLIN-845 Project: Apache Gobblin Issue Type: Improvement Reporter: Jay Sen Currently Gobblin uses a very old Gradle version (2.13); the current version is 5.5.1, and a lot of new features have been added since then. A project of this complexity would especially benefit from the latest features, particularly around dependency management. This does, however, require a lot of changes to refactor the Gradle scripts because of deprecated features. -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[jira] [Updated] (GOBBLIN-845) upgrade to latest gradle
[ https://issues.apache.org/jira/browse/GOBBLIN-845?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jay Sen updated GOBBLIN-845: Affects Version/s: 0.14.0 > upgrade to latest gradle > > > Key: GOBBLIN-845 > URL: https://issues.apache.org/jira/browse/GOBBLIN-845 > Project: Apache Gobblin > Issue Type: Improvement >Affects Versions: 0.14.0 >Reporter: Jay Sen >Priority: Major > > currently Gobblin uses very old gradle version (2.13), current version is > 5.5.1 > there are lot of new features added since then. > specially for project with this complexity will get help from latest > features, specially around dependency management. > This, although, requires lot of changes in gradle script to refactor due to > deprecated features. -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[jira] [Updated] (GOBBLIN-845) upgrade to latest gradle
[ https://issues.apache.org/jira/browse/GOBBLIN-845?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jay Sen updated GOBBLIN-845: Fix Version/s: 0.15.0 > upgrade to latest gradle > > > Key: GOBBLIN-845 > URL: https://issues.apache.org/jira/browse/GOBBLIN-845 > Project: Apache Gobblin > Issue Type: Improvement >Affects Versions: 0.14.0 >Reporter: Jay Sen >Priority: Major > Fix For: 0.15.0 > > > currently Gobblin uses very old gradle version (2.13), current version is > 5.5.1 > there are lot of new features added since then. > specially for project with this complexity will get help from latest > features, specially around dependency management. > This, although, requires lot of changes in gradle script to refactor due to > deprecated features. -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[jira] [Created] (GOBBLIN-844) centralize all graddle scripts
Jay Sen created GOBBLIN-844: --- Summary: centralize all graddle scripts Key: GOBBLIN-844 URL: https://issues.apache.org/jira/browse/GOBBLIN-844 Project: Apache Gobblin Issue Type: Improvement Affects Versions: 0.14.0 Reporter: Jay Sen Fix For: 0.15.0 Currently the flavored, environment, and some other Gradle scripts are scattered; all scripts that facilitate the custom Gradle build for Gobblin should be under gradle dir scripts/ . The task here is pretty much to move the files and change the paths of those Gradle scripts accordingly. -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[jira] [Created] (GOBBLIN-843) Separately startable Admin UI & REST Server
Jay Sen created GOBBLIN-843: --- Summary: Separately startable Admin UI & REST Server Key: GOBBLIN-843 URL: https://issues.apache.org/jira/browse/GOBBLIN-843 Project: Apache Gobblin Issue Type: New Feature Affects Versions: 0.14.0 Reporter: Jay Sen Fix For: 0.15.0 Currently, the admin UI & REST server are started from, and clubbed within, the master process ( standalone, or cluster master/worker process ). If we had the ability to start the REST server and admin UI separately, it would help manage them better and decouple the deployment modes from the admin/REST services. The gobblin service command should facilitate starting and stopping the admin/REST services. -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[jira] [Updated] (GOBBLIN-842) remove redundent TestAppender class
[ https://issues.apache.org/jira/browse/GOBBLIN-842?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jay Sen updated GOBBLIN-842: Affects Version/s: 0.14.0 Fix Version/s: 0.15.0 > remove redundent TestAppender class > --- > > Key: GOBBLIN-842 > URL: https://issues.apache.org/jira/browse/GOBBLIN-842 > Project: Apache Gobblin > Issue Type: New Feature >Affects Versions: 0.14.0 >Reporter: Jay Sen >Priority: Minor > Fix For: 0.15.0 > > > for test case purpose TestAppender classes are created locally, it can be > common to remove redundent class -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[jira] [Commented] (GOBBLIN-842) remove redundent TestAppender class
[ https://issues.apache.org/jira/browse/GOBBLIN-842?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16898361#comment-16898361 ] Jay Sen commented on GOBBLIN-842: - can find reference and context of this change in GOBBLIN-822 > remove redundent TestAppender class > --- > > Key: GOBBLIN-842 > URL: https://issues.apache.org/jira/browse/GOBBLIN-842 > Project: Apache Gobblin > Issue Type: New Feature >Affects Versions: 0.14.0 >Reporter: Jay Sen >Priority: Minor > Fix For: 0.15.0 > > > for test case purpose TestAppender classes are created locally, it can be > common to remove redundent class -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[jira] [Created] (GOBBLIN-842) remove redundent TestAppender class
Jay Sen created GOBBLIN-842: --- Summary: remove redundent TestAppender class Key: GOBBLIN-842 URL: https://issues.apache.org/jira/browse/GOBBLIN-842 Project: Apache Gobblin Issue Type: New Feature Reporter: Jay Sen For test purposes, TestAppender classes are created locally; a common one can be shared to remove the redundant classes. -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[jira] [Created] (GOBBLIN-835) add date transformation converter
Jay Sen created GOBBLIN-835: --- Summary: add date transformation converter Key: GOBBLIN-835 URL: https://issues.apache.org/jira/browse/GOBBLIN-835 Project: Apache Gobblin Issue Type: New Feature Affects Versions: 0.14.0 Reporter: Jay Sen Fix For: 0.15.0 Create a new converter for date formats. -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[jira] [Created] (GOBBLIN-832) get specific tables instead of all tables from Hive DB
Jay Sen created GOBBLIN-832: --- Summary: get specific tables instead of all tables from Hive DB Key: GOBBLIN-832 URL: https://issues.apache.org/jira/browse/GOBBLIN-832 Project: Apache Gobblin Issue Type: Improvement Affects Versions: 0.14.0 Reporter: Jay Sen Fix For: 0.15.0 {{HiveDataSetFinder}} uses (Hive's) {{client.get().getAllDatabases()}} and {{client.get().getAllTables(db)}}, which can be inefficient when a DB has thousands of tables; getting all tables every time won't be efficient for the metastore. It should use methods like {{listTableNamesByFilter}} or {{getTables}}, which also provide ways to specify a table pattern. -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[jira] [Created] (GOBBLIN-830) update config key to define job launcher type
Jay Sen created GOBBLIN-830: --- Summary: update config key to define job launcher type Key: GOBBLIN-830 URL: https://issues.apache.org/jira/browse/GOBBLIN-830 Project: Apache Gobblin Issue Type: Improvement Reporter: Jay Sen config key JOB_LAUNCHER_TYPE_KEY = "launcher.type" should be set to "job.launcher.type" -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[jira] [Created] (GOBBLIN-824) upgrade libs versions in Gobblin
Jay Sen created GOBBLIN-824: --- Summary: upgrade libs versions in Gobblin Key: GOBBLIN-824 URL: https://issues.apache.org/jira/browse/GOBBLIN-824 Project: Apache Gobblin Issue Type: Improvement Affects Versions: 0.14.0 Reporter: Jay Sen Fix For: 0.15.0 A lot of libs are old, like hadoop, hive, etc... It won't be easy to just compile Gobblin with new versions by passing the new version on the command line, since there have been a lot of changes over the last couple of years. Gobblin should use the latest versions: hadoop : 2.7.7 hive : 2.3.5 pegasus: 24.0.2 etc... Please feel free to mention which libs should be updated as part of this overall upgrade process. -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[jira] [Updated] (GOBBLIN-822) upgrade log4j to log4j2
[ https://issues.apache.org/jira/browse/GOBBLIN-822?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jay Sen updated GOBBLIN-822: Summary: upgrade log4j to log4j2 (was: update log4j to log4j2) > upgrade log4j to log4j2 > --- > > Key: GOBBLIN-822 > URL: https://issues.apache.org/jira/browse/GOBBLIN-822 > Project: Apache Gobblin > Issue Type: Sub-task >Affects Versions: 0.14.0 >Reporter: Jay Sen >Priority: Major > Fix For: 0.15.0 > > > log4j2 has routing appender that would be super useful and probably only way > to achieve " job specific log files" functionality without meddling around > fileHandler in log4j. > Also log4j2 has lot of new functionalities and performance benefits -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[jira] [Updated] (GOBBLIN-822) update log4j to log4j2
[ https://issues.apache.org/jira/browse/GOBBLIN-822?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jay Sen updated GOBBLIN-822: Description: log4j2 has routing appender that would be super useful and probably only way to achieve " job specific log files" functionality without meddling around fileHandler in log4j. Also log4j2 has lot of new functionalities and performance benefits was: log4j2 has routing appender that would be super useful and probably only way to achieve this functionality without meddling around fileHandler in log4j. Also log4j2 has lot of new functionalities and performance benefits > update log4j to log4j2 > -- > > Key: GOBBLIN-822 > URL: https://issues.apache.org/jira/browse/GOBBLIN-822 > Project: Apache Gobblin > Issue Type: Sub-task >Affects Versions: 0.14.0 >Reporter: Jay Sen >Priority: Major > Fix For: 0.15.0 > > > log4j2 has routing appender that would be super useful and probably only way > to achieve " job specific log files" functionality without meddling around > fileHandler in log4j. > Also log4j2 has lot of new functionalities and performance benefits -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[jira] [Created] (GOBBLIN-823) upgrade or remove lombok usage
Jay Sen created GOBBLIN-823: --- Summary: upgrade or remove lombok usage Key: GOBBLIN-823 URL: https://issues.apache.org/jira/browse/GOBBLIN-823 Project: Apache Gobblin Issue Type: Improvement Affects Versions: 0.14.0 Reporter: Jay Sen Fix For: 0.15.0 Lombok is useful once a developer gets used to it, but it has its own learning curve for new users. Given that Lombok adds unnecessary complexity and most of these things can be taken care of by a smart IDE anyway, I suggest we remove the use of Lombok. If not, let's at least upgrade to the latest version to get all bug fixes. -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[jira] [Updated] (GOBBLIN-788) job specific log files
[ https://issues.apache.org/jira/browse/GOBBLIN-788?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jay Sen updated GOBBLIN-788: Affects Version/s: 0.14.0 > job specific log files > -- > > Key: GOBBLIN-788 > URL: https://issues.apache.org/jira/browse/GOBBLIN-788 > Project: Apache Gobblin > Issue Type: New Feature >Affects Versions: 0.14.0 >Reporter: Jay Sen >Priority: Major > > Each job or task running on individual node should create separate log file > (with some standard file naming convention) which has all the logging > specific to that job or task only. The primary platform specific logging can > still go to {{logs/.[out|err]}} > This would benefit in easy maintenance and operations and debugging. > -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[jira] [Updated] (GOBBLIN-788) job specific log files
[ https://issues.apache.org/jira/browse/GOBBLIN-788?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jay Sen updated GOBBLIN-788: Fix Version/s: 0.15.0 > job specific log files > -- > > Key: GOBBLIN-788 > URL: https://issues.apache.org/jira/browse/GOBBLIN-788 > Project: Apache Gobblin > Issue Type: New Feature >Affects Versions: 0.14.0 >Reporter: Jay Sen >Priority: Major > Fix For: 0.15.0 > > > Each job or task running on individual node should create separate log file > (with some standard file naming convention) which has all the logging > specific to that job or task only. The primary platform specific logging can > still go to {{logs/.[out|err]}} > This would benefit in easy maintenance and operations and debugging. > -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[jira] [Created] (GOBBLIN-822) update log4j to log4j2
Jay Sen created GOBBLIN-822: --- Summary: update log4j to log4j2 Key: GOBBLIN-822 URL: https://issues.apache.org/jira/browse/GOBBLIN-822 Project: Apache Gobblin Issue Type: Sub-task Affects Versions: 0.14.0 Reporter: Jay Sen Fix For: 0.15.0 log4j2 has routing appender that would be super useful and probably only way to achieve this functionality without meddling around fileHandler in log4j. Also log4j2 has lot of new functionalities and performance benefits -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[jira] [Created] (GOBBLIN-818) MIgrate to Hive 2.x as default
Jay Sen created GOBBLIN-818: --- Summary: MIgrate to Hive 2.x as default Key: GOBBLIN-818 URL: https://issues.apache.org/jira/browse/GOBBLIN-818 Project: Apache Gobblin Issue Type: Improvement Reporter: Jay Sen Gobblin uses old hive 1.x. Hive 2.x has significant changes and some incompatible/deprecated classes. while hive 3.x is already in pipeline along with Hadoop 3.x, we should move to hive 2.x so user dont have to deal with manual fixes for their use of Gobblin. we should also update Hadoop libs from 2.3 to 2.7.7 at least which is very stable. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (GOBBLIN-707) combine & standardize all gobblin scripts into one master script & restructure configs accordingly
[ https://issues.apache.org/jira/browse/GOBBLIN-707?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16878203#comment-16878203 ] Jay Sen commented on GOBBLIN-707: - [~ibuenros], pls take a look. > combine & standardize all gobblin scripts into one master script & > restructure configs accordingly > -- > > Key: GOBBLIN-707 > URL: https://issues.apache.org/jira/browse/GOBBLIN-707 > Project: Apache Gobblin > Issue Type: Improvement >Reporter: Jay Sen >Priority: Major > Time Spent: 9h > Remaining Estimate: 0h > > gobblin supports multiple modes of executions ( CLI, Standalone, > cluster-master, cluster-worker, AWS, YARN, MR ) and various command lines > utility to run cli and admin commands. There is a individual script for each > of them. > Having individual script introduces lot of issues > # all scripts handles gobblin variables, user parameters differently, and > its highly inconsistent among various different gobblin scripts > # functionality around start, stop, status checking and handling PID's among > lot of other things, varies vastly as per the implementation of the script. > # features like GC & JVM params, log4j file selection, classpath > calculation, etc... exists in some gobblin scripts but not all, adding to > inconsistent user experience. > # maintaining total 13 script would be too much effort. > Also all the gobblin scripts share lot of common code to handle params, > start, stop services, status checks, pid handling, etc... combining all the > scripts into 1 not only makes maintenance easier but also brings clarity and > consistency. > > Solution: > 1. there can be one gobblin.sh script to handle all gobblin commands and > deployment options as per following signature. 
NOTE: This > {{gobblin.sh }} > {{gobblin.sh }} > {{commands values: admin, cli, statestore-check, statestore-clean, > historystore-manager, classpath}} > {{service values: standalone, cluster-master, cluster-worker, aws, yarn, mr, > service}} > with the above change, the following become valid commands. > {code:java} > # all under GobblinCli class > gobblin run listQuickApps -> gobblin cli run listQuickApps > gobblin run listQuickApps -> gobblin cli run listQuickApps > gobblin run -> gobblin cli run > # class: JobStateToJsonConverter > statestore-checker.sh -> gobblin statestore-checker > # class: StateStoreCleaner > statestore-clean.sh -> gobblin statestore-clean > # class: DatabaseJobHistoryStoreSchemaManager > historystore-manager.sh -> gobblin historystore-manager > # class: Cli > gobblin-admin.sh -> gobblin admin > # all gobblin deployment modes > gobblin-cluster-master.sh -> gobblin cluster-master start|stop|status > gobblin-cluster-worker.sh -> gobblin cluster-master start|stop|status > gobblin-compaction.sh -> gobblin cluster-master start|stop|status > gobblin-env.sh -> gobblin cluster-master start|stop|status > gobblin-mapreduce.sh -> gobblin cluster-master start|stop|status > gobblin-service.sh -> gobblin cluster-master start|stop|status > gobblin-standalone.sh -> gobblin cluster-master start|stop|status > gobblin-yarn.sh -> gobblin cluster-master start|stop|status > {code} > > 2. Also, configs need to be structured and deduped accordingly to make it > clear which config will be picked up for which execution mode. > > NOTE: this refactoring adds all cli and service commands to gobblin.sh and > hence changes the syntax for all commands and services. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (GOBBLIN-788) job specific log files
[ https://issues.apache.org/jira/browse/GOBBLIN-788?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jay Sen updated GOBBLIN-788: Description: Each job or task running on individual node should create separate log file (with some standard file naming convention) which has all the logging specific to that job or task only. The primary platform specific logging can still go to {{logs/.[out|err]}} This would benefit in easy maintenance and operations and debugging. was: Each job or task running on individual node should create separate log file (with some standard convention) which has all the logging specific to that job or task only. The primary platform specific logging can still go to {{logs/.[out|err]}} This would benefit in lot of use-cases. > job specific log files > -- > > Key: GOBBLIN-788 > URL: https://issues.apache.org/jira/browse/GOBBLIN-788 > Project: Apache Gobblin > Issue Type: New Feature >Reporter: Jay Sen >Priority: Major > > Each job or task running on individual node should create separate log file > (with some standard file naming convention) which has all the logging > specific to that job or task only. The primary platform specific logging can > still go to {{logs/.[out|err]}} > This would benefit in easy maintenance and operations and debugging. > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (GOBBLIN-811) non gobblin job files should not throw exception
[ https://issues.apache.org/jira/browse/GOBBLIN-811?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jay Sen updated GOBBLIN-811:
Description: Any files under the job dir other than .pull or .conf should be ignored with a DEBUG or WARN message instead of throwing an exception. For example:
{{2019-06-22 21:21:05 PDT ERROR [newDaemonThreadFactory] org.apache.gobblin.util.filesystem.ExceptionCatchingPathAlterationListenerDecorator - onFileCreate failure:
java.lang.RuntimeException: Cannot load pull file file:/tools/gobblin-dist/gobblin-cluster-data/jobs/test_job.pull_old due to unrecognized extension.
 at org.apache.gobblin.runtime.job_catalog.FSPathAlterationListenerAdaptor.onFileCreate(FSPathAlterationListenerAdaptor.java:61)
 at org.apache.gobblin.util.filesystem.ExceptionCatchingPathAlterationListenerDecorator.onFileCreate(ExceptionCatchingPathAlterationListenerDecorator.java:55)
 at org.apache.gobblin.util.filesystem.PathAlterationObserver.doCreate(PathAlterationObserver.java:276)
 at org.apache.gobblin.util.filesystem.PathAlterationObserver.checkAndNotify(PathAlterationObserver.java:210)
 at org.apache.gobblin.util.filesystem.PathAlterationObserver.checkAndNotify(PathAlterationObserver.java:180)
 at org.apache.gobblin.util.filesystem.PathAlterationObserverScheduler.run(PathAlterationObserverScheduler.java:163)
 at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
 at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
 at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
 at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
 at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
 at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
 at java.lang.Thread.run(Thread.java:748)}}

> non gobblin job files should not throw exception
>
> Key: GOBBLIN-811
> URL: https://issues.apache.org/jira/browse/GOBBLIN-811
> Project: Apache Gobblin
> Issue Type: Improvement
> Reporter: Jay Sen
> Priority: Minor
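The behaviour requested in GOBBLIN-811 can be illustrated with a minimal sketch. This is not Gobblin's actual code: the class `JobFileFilter` and its methods are hypothetical names made up for illustration, and the sketch assumes that only the `.pull` and `.conf` extensions named in the issue identify job files. It takes a bare file name (not a full path) and logs a warning for anything else rather than throwing.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;
import java.util.logging.Logger;

// Hypothetical helper, not part of Gobblin: decides whether a file in the
// job directory should be loaded as a job file or skipped with a warning.
final class JobFileFilter {
    private static final Logger LOG = Logger.getLogger(JobFileFilter.class.getName());

    // Assumption: only the two extensions named in the issue count as job files.
    private static final Set<String> JOB_EXTENSIONS =
            new HashSet<>(Arrays.asList("pull", "conf"));

    // Returns the text after the last '.', or "" if the name has no dot.
    static String extensionOf(String fileName) {
        int dot = fileName.lastIndexOf('.');
        return dot < 0 ? "" : fileName.substring(dot + 1);
    }

    static boolean isJobFile(String fileName) {
        return JOB_EXTENSIONS.contains(extensionOf(fileName));
    }

    // Logs a warning and returns false for non-job files instead of throwing.
    static boolean shouldLoad(String fileName) {
        if (isJobFile(fileName)) {
            return true;
        }
        LOG.warning("Ignoring non-job file: " + fileName);
        return false;
    }
}
```

With this check, a file such as `test_job.pull_old` from the log above would be skipped with a warning, because its extension is `pull_old` rather than `pull`.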
[jira] [Commented] (GOBBLIN-613) Use the Hadoop tokens provided by WormholePushJob, instead of negotiating them via Gobblin
[ https://issues.apache.org/jira/browse/GOBBLIN-613?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16875162#comment-16875162 ]

Jay Sen commented on GOBBLIN-613:
[~rakeshmalladi2018], can you provide some more detail here? Thanks

> Use the Hadoop tokens provided by WormholePushJob, instead of negotiating them via Gobblin
>
> Key: GOBBLIN-613
> URL: https://issues.apache.org/jira/browse/GOBBLIN-613
> Project: Apache Gobblin
> Issue Type: New Feature
> Reporter: Rakesh Malladi
> Priority: Major

-- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (GOBBLIN-812) take worker id from command line if specified
Jay Sen created GOBBLIN-812:
Summary: take worker id from command line if specified
Key: GOBBLIN-812
URL: https://issues.apache.org/jira/browse/GOBBLIN-812
Project: Apache Gobblin
Issue Type: New Feature
Components: gobblin-cluster
Affects Versions: 0.14.0
Reporter: Jay Sen
Assignee: Hung Tran
Fix For: 0.15.0

{{GobblinTaskRunner}} can take and use the worker id from the command line if it is specified when starting the worker.