[jira] [Created] (GOBBLIN-1205) restarting gobblin on yarn fails with error

2020-06-20 Thread Jay Sen (Jira)
Jay Sen created GOBBLIN-1205:


 Summary: restarting gobblin on yarn fails with error
 Key: GOBBLIN-1205
 URL: https://issues.apache.org/jira/browse/GOBBLIN-1205
 Project: Apache Gobblin
  Issue Type: Bug
Affects Versions: 0.15.0
Reporter: Jay Sen
 Fix For: 0.15.0


Restarting Gobblin deployed in YARN mode occasionally fails on startup with the 
following error. Maybe the path is still held by the previous process; it may 
just need a bit of time between stop and start.
{code:java}
WARN [ZKHelixAdmin] Root directory exists.Cleaning the root directory:/GobblinYarnHelixApp
WARN [ZKHelixAdmin] Root directory exists.Cleaning the root directory:/GobblinYarnHelixApp
WARN [ZkClient] Failed to delete path /GobblinYarnHelixApp/CONTROLLER! org.I0Itec.zkclient.exception.ZkException: org.apache.zookeeper.KeeperException$NotEmptyException: KeeperErrorCode = Directory not empty for /GobblinYarnHelixApp/CONTROLLER
ERROR [ZkClient] Failed to delete /GobblinYarnHelixApp/CONTROLLER
org.I0Itec.zkclient.exception.ZkException: org.apache.zookeeper.KeeperException$NotEmptyException: KeeperErrorCode = Directory not empty for /GobblinYarnHelixApp/CONTROLLER
    at org.I0Itec.zkclient.exception.ZkException.create(ZkException.java:68)
    at org.apache.helix.manager.zk.zookeeper.ZkClient.retryUntilConnected(ZkClient.java:1160)
    at org.apache.helix.manager.zk.zookeeper.ZkClient.delete(ZkClient.java:1215)
    at org.apache.helix.manager.zk.zookeeper.ZkClient.deleteRecursively(ZkClient.java:949)
    at org.apache.helix.manager.zk.zookeeper.ZkClient.deleteRecursively(ZkClient.java:942)
    at org.apache.helix.manager.zk.ZKHelixAdmin.addCluster(ZKHelixAdmin.java:698)
    at org.apache.helix.tools.ClusterSetup.addCluster(ClusterSetup.java:162)
    at org.apache.gobblin.cluster.HelixUtils.createGobblinHelixCluster(HelixUtils.java:96)
    at org.apache.gobblin.yarn.GobblinYarnAppLauncher.launch(GobblinYarnAppLauncher.java:337)
    at org.apache.gobblin.yarn.GobblinYarnAppLauncher.main(GobblinYarnAppLauncher.java:1067)
Caused by: org.apache.zookeeper.KeeperException$NotEmptyException: KeeperErrorCode = Directory not empty for /GobblinYarnHelixApp/CONTROLLER
    at org.apache.zookeeper.KeeperException.create(KeeperException.java:128)
    at org.apache.zookeeper.KeeperException.create(KeeperException.java:54)
    at org.apache.zookeeper.ZooKeeper.delete(ZooKeeper.java:882)
    at org.apache.helix.manager.zk.zookeeper.ZkConnection.delete(ZkConnection.java:119)
    at org.apache.helix.manager.zk.zookeeper.ZkClient$9.call(ZkClient.java:1219)
    at org.apache.helix.manager.zk.zookeeper.ZkClient.retryUntilConnected(ZkClient.java:1150)
    ... 8 more
==> logs/yarn.err <==
Exception in thread "main" org.apache.helix.HelixException: Failed to delete /GobblinYarnHelixApp/CONTROLLER
    at org.apache.helix.manager.zk.zookeeper.ZkClient.deleteRecursively(ZkClient.java:952)
    at org.apache.helix.manager.zk.zookeeper.ZkClient.deleteRecursively(ZkClient.java:942)
    at org.apache.helix.manager.zk.ZKHelixAdmin.addCluster(ZKHelixAdmin.java:698)
    at org.apache.helix.tools.ClusterSetup.addCluster(ClusterSetup.java:162)
    at org.apache.gobblin.cluster.HelixUtils.createGobblinHelixCluster(HelixUtils.java:96)
    at org.apache.gobblin.yarn.GobblinYarnAppLauncher.launch(GobblinYarnAppLauncher.java:337)
    at org.apache.gobblin.yarn.GobblinYarnAppLauncher.main(GobblinYarnAppLauncher.java:1067)
Caused by: org.I0Itec.zkclient.exception.ZkException: org.apache.zookeeper.KeeperException$NotEmptyException: KeeperErrorCode = Directory not empty for /GobblinYarnHelixApp/CONTROLLER
    at org.I0Itec.zkclient.exception.ZkException.create(ZkException.java:68)
    at org.apache.helix.manager.zk.zookeeper.ZkClient.retryUntilConnected(ZkClient.java:1160)
    at org.apache.helix.manager.zk.zookeeper.ZkClient.delete(ZkClient.java:1215)
    at org.apache.helix.manager.zk.zookeeper.ZkClient.deleteRecursively(ZkClient.java:949)
    ... 6 more
Caused by: org.apache.zookeeper.KeeperException$NotEmptyException: KeeperErrorCode = Directory not empty for /GobblinYarnHelixApp/CONTROLLER
    at org.apache.zookeeper.KeeperException.create(KeeperException.java:128)
    at org.apache.zookeeper.KeeperException.create(KeeperException.java:54)
    at org.apache.zookeeper.ZooKeeper.delete(ZooKeeper.java:882)
    at org.apache.helix.manager.zk.zookeeper.ZkConnection.delete(ZkConnection.java:119)
    at org.apache.helix.manager.zk.zookeeper.ZkClient$9.call(ZkClient.java:1219)
    at org.apache.helix.manager.zk.zookeeper.ZkClient.retryUntilConnected(ZkClient.java:1150)
    ... 8 more
Exception in thread "Thread-6" org.apache.helix.HelixException: HelixManager (ZkClient) is not connected. Call HelixManager#connect()
    at org.apache.helix.manager.zk.ZKHelixManager.checkConnected(ZKHelixManager.java:363)
    at org.apache.helix.manager.zk.ZKHelixManager.getClusterManagmentTool(ZKHelixManager.java:908)
    at

[jira] [Created] (GOBBLIN-1204) stopping yarn does not stop application master or containers

2020-06-20 Thread Jay Sen (Jira)
Jay Sen created GOBBLIN-1204:


 Summary: stopping yarn does not stop application master or 
containers
 Key: GOBBLIN-1204
 URL: https://issues.apache.org/jira/browse/GOBBLIN-1204
 Project: Apache Gobblin
  Issue Type: Bug
Affects Versions: 0.15.0
Reporter: Jay Sen
 Fix For: 0.15.0


The gobblin yarn command to stop the yarn mode should also stop the yarn 
application on hadoop including all its containers.

 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (GOBBLIN-1194) add Google cloud connector

2020-06-13 Thread Jay Sen (Jira)
Jay Sen created GOBBLIN-1194:


 Summary: add Google cloud connector
 Key: GOBBLIN-1194
 URL: https://issues.apache.org/jira/browse/GOBBLIN-1194
 Project: Apache Gobblin
  Issue Type: New Feature
Affects Versions: 0.15.0
Reporter: Jay Sen
 Fix For: 0.15.0


This task is for adding a Google Cloud connector to Gobblin to support writing 
to GCS.

ref: [https://cloud.google.com/dataproc/docs/concepts/connectors/cloud-storage]

 





[jira] [Created] (GOBBLIN-1182) Incorrect datetime value error on mysql 5.7

2020-06-04 Thread Jay Sen (Jira)
Jay Sen created GOBBLIN-1182:


 Summary: Incorrect datetime value error on mysql 5.7
 Key: GOBBLIN-1182
 URL: https://issues.apache.org/jira/browse/GOBBLIN-1182
 Project: Apache Gobblin
  Issue Type: Improvement
  Components: gobblin-core
Affects Versions: 0.15.0
Reporter: Jay Sen
Assignee: Abhishek Tiwari
 Fix For: 0.15.0


Inserting a jobExecution record with the default epoch timestamp value throws 
an error on MySQL 5.7+.

The reason is that MySQL 5.7+ converts timestamps between the session time zone 
and UTC for internal storage and by default rejects anything earlier than the 
epoch, unless the server is explicitly configured to allow zero or negative times.

As a quick fix, Gobblin can use 1970-01-02 00:00:00, which stays after the epoch 
in every time zone in the world. An alternative is not to set end_Date at all 
and to allow null to indicate "not set".
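
The time-zone reasoning above can be checked with a small sketch. It is only an illustration built on `java.time`, not Gobblin's code: the class `SafeDefaultTimestamp` and its methods are hypothetical names, and the lower bound of `1970-01-01 00:00:01` UTC is assumed from MySQL's documented `TIMESTAMP` range.

```java
import java.time.Instant;
import java.time.LocalDateTime;
import java.time.ZoneId;

public class SafeDefaultTimestamp {
    // Assumed lower bound of MySQL's TIMESTAMP range: 1970-01-01 00:00:01 UTC.
    static final Instant MYSQL_MIN = Instant.ofEpochSecond(1);

    // True if the wall-clock value, interpreted in the given zone,
    // is not earlier than the assumed lower bound.
    public static boolean isStorable(LocalDateTime local, ZoneId zone) {
        return !local.atZone(zone).toInstant().isBefore(MYSQL_MIN);
    }

    // True if the wall-clock value is storable in every zone the JVM knows.
    public static boolean isStorableEverywhere(LocalDateTime local) {
        return ZoneId.getAvailableZoneIds().stream()
                .map(ZoneId::of)
                .allMatch(zone -> isStorable(local, zone));
    }

    public static void main(String[] args) {
        LocalDateTime epoch = LocalDateTime.of(1970, 1, 1, 0, 0, 0);
        LocalDateTime dayAfter = LocalDateTime.of(1970, 1, 2, 0, 0, 0);
        // East of UTC, the local epoch falls before 1970-01-01 UTC.
        System.out.println(isStorable(epoch, ZoneId.of("Asia/Kolkata")));
        // One day later is safe because zone offsets never exceed 18 hours.
        System.out.println(isStorableEverywhere(dayAfter));
    }
}
```

The null alternative mentioned in the issue avoids the range question entirely, at the cost of null handling wherever the column is read.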





[jira] [Created] (GOBBLIN-1161) Hadoop 3.x support

2020-05-26 Thread Jay Sen (Jira)
Jay Sen created GOBBLIN-1161:


 Summary: Hadoop 3.x support
 Key: GOBBLIN-1161
 URL: https://issues.apache.org/jira/browse/GOBBLIN-1161
 Project: Apache Gobblin
  Issue Type: New Feature
Reporter: Jay Sen


To add support for hadoop 3.x

There are already some breaking changes so we may have to brainstorm about how 
to add this support.

 





[jira] [Updated] (GOBBLIN-1139) Basic Aerospike key value writer with Avro schema

2020-05-01 Thread Jay Sen (Jira)


 [ 
https://issues.apache.org/jira/browse/GOBBLIN-1139?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jay Sen updated GOBBLIN-1139:
-
Summary: Basic Aerospike key value writer with Avro schema  (was: Basic 
Aerospike writer with Avro schema)

> Basic Aerospike key value writer with Avro schema
> -
>
> Key: GOBBLIN-1139
> URL: https://issues.apache.org/jira/browse/GOBBLIN-1139
> Project: Apache Gobblin
>  Issue Type: Sub-task
>Reporter: Jay Sen
>Priority: Major
>






[jira] [Created] (GOBBLIN-1138) AeroSpike Writer

2020-05-01 Thread Jay Sen (Jira)
Jay Sen created GOBBLIN-1138:


 Summary: AeroSpike Writer
 Key: GOBBLIN-1138
 URL: https://issues.apache.org/jira/browse/GOBBLIN-1138
 Project: Apache Gobblin
  Issue Type: New Feature
  Components: gobblin-connectors
Affects Versions: 0.15.0
Reporter: Jay Sen
Assignee: Shirshanka Das
 Fix For: 0.15.0


Aerospike is a very efficient key-value store with high throughput. 

This story is to add an Aerospike writer to Gobblin. It should have features 
such as the following.
 # Basic key value writer
 # Abstract writer with implementation with support for Avro schema
 # Sync/Async write modes
 # throttling integration
 # unsecure and secure connection handling
 # error & retries handling ( retry queue for data delivery SLA )
 # metrics integration specific to the writer

This would be phased development with subtasks. Please feel free to coordinate 
for contribution.
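
As a rough illustration of the "error & retries handling" item above, here is a minimal sketch of a synchronous key-value writer with a retry queue. Everything in it is an assumption made for illustration: `RetryingKeyValueWriter`, its `Sink` interface, and the `maxAttempts` parameter are hypothetical names, not Gobblin or Aerospike APIs, and a real writer would call the Aerospike client inside `Sink.put`.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

public class RetryingKeyValueWriter {
    /** Hypothetical destination; a real writer would wrap the store's client here. */
    public interface Sink {
        void put(String key, byte[] value) throws Exception;
    }

    /** A record waiting to be written, with the number of attempts made so far. */
    private static final class Pending {
        final String key;
        final byte[] value;
        int attempts;

        Pending(String key, byte[] value) {
            this.key = key;
            this.value = value;
        }
    }

    private final Sink sink;
    private final int maxAttempts;
    private final Deque<Pending> retryQueue = new ArrayDeque<>();
    private int dropped;

    public RetryingKeyValueWriter(Sink sink, int maxAttempts) {
        this.sink = sink;
        this.maxAttempts = maxAttempts;
    }

    /** Tries to write once; a failed record is queued for a later retry. */
    public void write(String key, byte[] value) {
        attempt(new Pending(key, value));
    }

    /** Retries each record that was queued when this call started, once. */
    public void drainRetries() {
        int queued = retryQueue.size();
        for (int i = 0; i < queued; i++) {
            attempt(retryQueue.pollFirst());
        }
    }

    public int pendingRetries() {
        return retryQueue.size();
    }

    public int dropped() {
        return dropped;
    }

    private void attempt(Pending p) {
        p.attempts++;
        try {
            sink.put(p.key, p.value);
        } catch (Exception e) {
            if (p.attempts < maxAttempts) {
                retryQueue.addLast(p);   // keep for another attempt
            } else {
                dropped++;               // give up after maxAttempts tries
            }
        }
    }

    public static void main(String[] args) {
        Map<String, byte[]> store = new HashMap<>();   // in-memory stand-in for the store
        RetryingKeyValueWriter writer = new RetryingKeyValueWriter(store::put, 3);
        writer.write("user:1", new byte[] {42});
        System.out.println(store.containsKey("user:1"));
    }
}
```

An async mode, throttling, and metrics would sit around this core; they are left out to keep the sketch short.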





[jira] [Created] (GOBBLIN-1139) Basic Aerospike writer with Avro schema

2020-05-01 Thread Jay Sen (Jira)
Jay Sen created GOBBLIN-1139:


 Summary: Basic Aerospike writer with Avro schema
 Key: GOBBLIN-1139
 URL: https://issues.apache.org/jira/browse/GOBBLIN-1139
 Project: Apache Gobblin
  Issue Type: Sub-task
Reporter: Jay Sen








[jira] [Created] (GOBBLIN-1128) Gobblin JobStore on MySQL with CRUD API

2020-04-25 Thread Jay Sen (Jira)
Jay Sen created GOBBLIN-1128:


 Summary: Gobblin JobStore on MySQL with CRUD API
 Key: GOBBLIN-1128
 URL: https://issues.apache.org/jira/browse/GOBBLIN-1128
 Project: Apache Gobblin
  Issue Type: Improvement
  Components: gobblin-api, gobblin-config
Affects Versions: 0.15.0
Reporter: Jay Sen
Assignee: Hung Tran
 Fix For: 0.15.0


[https://cwiki.apache.org/confluence/display/GOBBLIN/GIP+4%3A+MySQL+backed+job+config+store]





[jira] [Updated] (GOBBLIN-1128) Gobblin JobStore on MySQL with CRUD API

2020-04-25 Thread Jay Sen (Jira)


 [ 
https://issues.apache.org/jira/browse/GOBBLIN-1128?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jay Sen updated GOBBLIN-1128:
-
Description: 
[details: 
https://cwiki.apache.org/confluence/display/GOBBLIN/GIP+4%3A+MySQL+backed+job+config+store|https://cwiki.apache.org/confluence/display/GOBBLIN/GIP+4%3A+MySQL+backed+job+config+store]

 

  
was:[https://cwiki.apache.org/confluence/display/GOBBLIN/GIP+4%3A+MySQL+backed+job+config+store]

 Issue Type: New Feature  (was: Improvement)

> Gobblin JobStore on MySQL with CRUD API
> ---
>
> Key: GOBBLIN-1128
> URL: https://issues.apache.org/jira/browse/GOBBLIN-1128
> Project: Apache Gobblin
>  Issue Type: New Feature
>  Components: gobblin-api, gobblin-config
>Affects Versions: 0.15.0
>Reporter: Jay Sen
>Assignee: Hung Tran
>Priority: Major
> Fix For: 0.15.0
>
>
> [details: 
> https://cwiki.apache.org/confluence/display/GOBBLIN/GIP+4%3A+MySQL+backed+job+config+store|https://cwiki.apache.org/confluence/display/GOBBLIN/GIP+4%3A+MySQL+backed+job+config+store]
>  





[jira] [Closed] (GOBBLIN-1095) jcenter only supports https now

2020-03-23 Thread Jay Sen (Jira)


 [ 
https://issues.apache.org/jira/browse/GOBBLIN-1095?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jay Sen closed GOBBLIN-1095.

Resolution: Duplicate

> jcenter only supports https now
> ---
>
> Key: GOBBLIN-1095
> URL: https://issues.apache.org/jira/browse/GOBBLIN-1095
> Project: Apache Gobblin
>  Issue Type: Improvement
>Affects Versions: 0.15.0
>Reporter: Jay Sen
>Priority: Critical
> Fix For: 0.15.0
>
>
> The [http://jcenter.bintray.com/] repo no longer works for downloading 
> libraries. It needs to be updated to https.





[jira] [Created] (GOBBLIN-1095) jcenter only supports https now

2020-03-23 Thread Jay Sen (Jira)
Jay Sen created GOBBLIN-1095:


 Summary: jcenter only supports https now
 Key: GOBBLIN-1095
 URL: https://issues.apache.org/jira/browse/GOBBLIN-1095
 Project: Apache Gobblin
  Issue Type: Improvement
Affects Versions: 0.15.0
Reporter: Jay Sen
 Fix For: 0.15.0


The [http://jcenter.bintray.com/] repo no longer works for downloading 
libraries. It needs to be updated to https.





[jira] [Commented] (GOBBLIN-984) Get Gobblin log outputting to work

2019-11-27 Thread Jay Sen (Jira)


[ 
https://issues.apache.org/jira/browse/GOBBLIN-984?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16983968#comment-16983968
 ] 

Jay Sen commented on GOBBLIN-984:
-

Pls feel free to revert. 

> Get Gobblin log outputting to work
> --
>
> Key: GOBBLIN-984
> URL: https://issues.apache.org/jira/browse/GOBBLIN-984
> Project: Apache Gobblin
>  Issue Type: Bug
>Reporter: William Lo
>Priority: Minor
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> [GOBBLIN-822]|[https://github.com/apache/incubator-gobblin/pull/2685] 
> Introduced issues with the scripts not outputting logs. Need to include 
> log4j2.xml in each service's `conf` folder





[jira] [Commented] (GOBBLIN-984) Get Gobblin log outputting to work

2019-11-27 Thread Jay Sen (Jira)


[ 
https://issues.apache.org/jira/browse/GOBBLIN-984?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16983967#comment-16983967
 ] 

Jay Sen commented on GOBBLIN-984:
-

Sure. Didn't know you guys are using the master branch.  I will set up some time to talk 
about this.  Thanks

> Get Gobblin log outputting to work
> --
>
> Key: GOBBLIN-984
> URL: https://issues.apache.org/jira/browse/GOBBLIN-984
> Project: Apache Gobblin
>  Issue Type: Bug
>Reporter: William Lo
>Priority: Minor
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> [GOBBLIN-822]|[https://github.com/apache/incubator-gobblin/pull/2685] 
> Introduced issues with the scripts not outputting logs. Need to include 
> log4j2.xml in each service's `conf` folder





[jira] [Commented] (GOBBLIN-984) Get Gobblin log outputting to work

2019-11-27 Thread Jay Sen (Jira)


[ 
https://issues.apache.org/jira/browse/GOBBLIN-984?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16983963#comment-16983963
 ] 

Jay Sen commented on GOBBLIN-984:
-

Can we meet later today anytime after 4pm? I will send you the zoom link to 
join.  Thanks

> Get Gobblin log outputting to work
> --
>
> Key: GOBBLIN-984
> URL: https://issues.apache.org/jira/browse/GOBBLIN-984
> Project: Apache Gobblin
>  Issue Type: Bug
>Reporter: William Lo
>Priority: Minor
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> [GOBBLIN-822]|[https://github.com/apache/incubator-gobblin/pull/2685] 
> Introduced issues with the scripts not outputting logs. Need to include 
> log4j2.xml in each service's `conf` folder





[jira] [Commented] (GOBBLIN-984) Get Gobblin log outputting to work

2019-11-27 Thread Jay Sen (Jira)


[ 
https://issues.apache.org/jira/browse/GOBBLIN-984?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16983958#comment-16983958
 ] 

Jay Sen commented on GOBBLIN-984:
-

We can work that out with slf4j integration.
Let's try to figure it out. Do you want to meet up online?

> Get Gobblin log outputting to work
> --
>
> Key: GOBBLIN-984
> URL: https://issues.apache.org/jira/browse/GOBBLIN-984
> Project: Apache Gobblin
>  Issue Type: Bug
>Reporter: William Lo
>Priority: Minor
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> [GOBBLIN-822]|[https://github.com/apache/incubator-gobblin/pull/2685] 
> Introduced issues with the scripts not outputting logs. Need to include 
> log4j2.xml in each service's `conf` folder





[jira] [Created] (GOBBLIN-985) upgrade Guava libs

2019-11-26 Thread Jay Sen (Jira)
Jay Sen created GOBBLIN-985:
---

 Summary: upgrade Guava libs
 Key: GOBBLIN-985
 URL: https://issues.apache.org/jira/browse/GOBBLIN-985
 Project: Apache Gobblin
  Issue Type: Sub-task
Affects Versions: 0.14.0
Reporter: Jay Sen
 Fix For: 0.15.0


Gobblin is using a very old Guava lib (15.0, from 2013); newer releases such as 26.0 are available.





[jira] [Commented] (GOBBLIN-984) Get Gobblin log outputting to work

2019-11-26 Thread Jay Sen (Jira)


[ 
https://issues.apache.org/jira/browse/GOBBLIN-984?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16982927#comment-16982927
 ] 

Jay Sen commented on GOBBLIN-984:
-

created this PR: [https://github.com/apache/incubator-gobblin/pull/2830]

we also need this PR to work together to standardize all other config and env 
issues : [https://github.com/apache/incubator-gobblin/pull/2788]

> Get Gobblin log outputting to work
> --
>
> Key: GOBBLIN-984
> URL: https://issues.apache.org/jira/browse/GOBBLIN-984
> Project: Apache Gobblin
>  Issue Type: Bug
>Reporter: William Lo
>Priority: Minor
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> [GOBBLIN-822]|[https://github.com/apache/incubator-gobblin/pull/2685] 
> Introduced issues with the scripts not outputting logs. Need to include 
> log4j2.xml in each service's `conf` folder





[jira] [Commented] (GOBBLIN-984) Get Gobblin log outputting to work

2019-11-26 Thread Jay Sen (Jira)


[ 
https://issues.apache.org/jira/browse/GOBBLIN-984?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16982766#comment-16982766
 ] 

Jay Sen commented on GOBBLIN-984:
-

Sure, I will create a PR in some time. Thanks

> Get Gobblin log outputting to work
> --
>
> Key: GOBBLIN-984
> URL: https://issues.apache.org/jira/browse/GOBBLIN-984
> Project: Apache Gobblin
>  Issue Type: Bug
>Reporter: William Lo
>Priority: Minor
>
> [GOBBLIN-822]|[https://github.com/apache/incubator-gobblin/pull/2685] 
> Introduced issues with the scripts not outputting logs. Need to include 
> log4j2.xml in each service's `conf` folder





[jira] [Comment Edited] (GOBBLIN-984) Get Gobblin log outputting to work

2019-11-26 Thread Jay Sen (Jira)


[ 
https://issues.apache.org/jira/browse/GOBBLIN-984?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16982753#comment-16982753
 ] 

Jay Sen edited comment on GOBBLIN-984 at 11/26/19 6:31 PM:
---

Hi [~wlo], I have this ready; I was using it locally and missed including 
those files. I can create a PR, pls let me know. Thanks


was (Author: jaysen):
Hi [~wlo], I have this ready; I was using it locally. I can create a PR, 
pls let me know. Thanks

> Get Gobblin log outputting to work
> --
>
> Key: GOBBLIN-984
> URL: https://issues.apache.org/jira/browse/GOBBLIN-984
> Project: Apache Gobblin
>  Issue Type: Bug
>Reporter: William Lo
>Priority: Minor
>
> [GOBBLIN-822]|[https://github.com/apache/incubator-gobblin/pull/2685] 
> Introduced issues with the scripts not outputting logs. Need to include 
> log4j2.xml in each service's `conf` folder





[jira] [Commented] (GOBBLIN-984) Get Gobblin log outputting to work

2019-11-26 Thread Jay Sen (Jira)


[ 
https://issues.apache.org/jira/browse/GOBBLIN-984?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16982753#comment-16982753
 ] 

Jay Sen commented on GOBBLIN-984:
-

Hi [~wlo], I have this ready. I can create a PR, pls let me know. Thanks

> Get Gobblin log outputting to work
> --
>
> Key: GOBBLIN-984
> URL: https://issues.apache.org/jira/browse/GOBBLIN-984
> Project: Apache Gobblin
>  Issue Type: Bug
>Reporter: William Lo
>Priority: Minor
>
> [GOBBLIN-822]|[https://github.com/apache/incubator-gobblin/pull/2685] 
> Introduced issues with the scripts not outputting logs. Need to include 
> log4j2.xml in each service's `conf` folder





[jira] [Comment Edited] (GOBBLIN-984) Get Gobblin log outputting to work

2019-11-26 Thread Jay Sen (Jira)


[ 
https://issues.apache.org/jira/browse/GOBBLIN-984?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16982753#comment-16982753
 ] 

Jay Sen edited comment on GOBBLIN-984 at 11/26/19 6:30 PM:
---

Hi [~wlo], I have this ready; I was using it locally. I can create a PR, 
pls let me know. Thanks


was (Author: jaysen):
Hi [~wlo], I have this ready. I can create a PR, pls let me know. Thanks

> Get Gobblin log outputting to work
> --
>
> Key: GOBBLIN-984
> URL: https://issues.apache.org/jira/browse/GOBBLIN-984
> Project: Apache Gobblin
>  Issue Type: Bug
>Reporter: William Lo
>Priority: Minor
>
> [GOBBLIN-822]|[https://github.com/apache/incubator-gobblin/pull/2685] 
> Introduced issues with the scripts not outputting logs. Need to include 
> log4j2.xml in each service's `conf` folder





[jira] [Issue Comment Deleted] (GOBBLIN-824) upgrade component and dependency library versions

2019-11-25 Thread Jay Sen (Jira)


 [ 
https://issues.apache.org/jira/browse/GOBBLIN-824?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jay Sen updated GOBBLIN-824:

Comment: was deleted

(was: GOBBLIN-818 does minor upgrades.)

> upgrade component and dependency library versions
> -
>
> Key: GOBBLIN-824
> URL: https://issues.apache.org/jira/browse/GOBBLIN-824
> Project: Apache Gobblin
>  Issue Type: Improvement
>Affects Versions: 0.14.0
>Reporter: Jay Sen
>Priority: Major
> Fix For: 0.15.0
>
>
> A lot of libs are old, like hadoop, hive, etc... 
>  It won't be easy to just compile gobblin with a new version by passing the new 
> version on the command line; there have been a lot of changes over the last couple of years. 
>  Gobblin should use the latest versions:
> Hadoop: 2.9.x 
>  hive : 2.3.5
>  pegasus: 24.0.2
> Avro : 1.8.2
> etc...
> please feel free to mention which lib should be updated as part of this 
> overall upgrade process.





[jira] [Updated] (GOBBLIN-824) upgrade component and dependency library versions

2019-11-25 Thread Jay Sen (Jira)


 [ 
https://issues.apache.org/jira/browse/GOBBLIN-824?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jay Sen updated GOBBLIN-824:

Summary: upgrade component and dependency library versions  (was: upgrade 
to latest libraries in Gobblin)

> upgrade component and dependency library versions
> -
>
> Key: GOBBLIN-824
> URL: https://issues.apache.org/jira/browse/GOBBLIN-824
> Project: Apache Gobblin
>  Issue Type: Improvement
>Affects Versions: 0.14.0
>Reporter: Jay Sen
>Priority: Major
> Fix For: 0.15.0
>
>
> A lot of libs are old, like hadoop, hive, etc... 
>  It won't be easy to just compile gobblin with a new version by passing the new 
> version on the command line; there have been a lot of changes over the last couple of years. 
>  Gobblin should use the latest versions:
> Hadoop: 2.9.x 
>  hive : 2.3.5
>  pegasus: 24.0.2
> Avro : 1.8.2
> etc...
> please feel free to mention which lib should be updated as part of this 
> overall upgrade process.





[jira] [Updated] (GOBBLIN-741) update influxdb-java dependency to latest (> 2.15)

2019-11-25 Thread Jay Sen (Jira)


 [ 
https://issues.apache.org/jira/browse/GOBBLIN-741?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jay Sen updated GOBBLIN-741:

Parent: GOBBLIN-824
Issue Type: Sub-task  (was: Improvement)

> update influxdb-java dependency to latest (> 2.15)
> --
>
> Key: GOBBLIN-741
> URL: https://issues.apache.org/jira/browse/GOBBLIN-741
> Project: Apache Gobblin
>  Issue Type: Sub-task
>Reporter: Jay Sen
>Priority: Major
>
> The current influxdb-java dependency is 2.1, which was released in Dec 2015.
> Upgrading can bring new functionality from the latest InfluxDB and new ways of testing 
> the InfluxDB integration within Gobblin.





[jira] [Updated] (GOBBLIN-823) upgrade or remove lombok usage

2019-11-25 Thread Jay Sen (Jira)


 [ 
https://issues.apache.org/jira/browse/GOBBLIN-823?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jay Sen updated GOBBLIN-823:

Parent: GOBBLIN-824
Issue Type: Sub-task  (was: Improvement)

> upgrade or remove lombok usage
> --
>
> Key: GOBBLIN-823
> URL: https://issues.apache.org/jira/browse/GOBBLIN-823
> Project: Apache Gobblin
>  Issue Type: Sub-task
>Affects Versions: 0.14.0
>Reporter: Jay Sen
>Priority: Major
> Fix For: 0.15.0
>
>
> Lombok is useful once a developer gets used to it, but it has its own learning 
> curve for new users. Given that lombok adds unnecessary complexity and most of the 
> things can be taken care of by a smart IDE anyway, I suggest we remove the use of 
> lombok.
> If not, let's at least upgrade to the latest version to get all the bug fixes.





[jira] [Updated] (GOBBLIN-979) upgrade Hive version

2019-11-25 Thread Jay Sen (Jira)


 [ 
https://issues.apache.org/jira/browse/GOBBLIN-979?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jay Sen updated GOBBLIN-979:

Parent: GOBBLIN-824
Issue Type: Sub-task  (was: Bug)

> upgrade Hive version
> 
>
> Key: GOBBLIN-979
> URL: https://issues.apache.org/jira/browse/GOBBLIN-979
> Project: Apache Gobblin
>  Issue Type: Sub-task
>  Components: hive-registration
>Affects Versions: 0.14.0
>Reporter: Jay Sen
>Priority: Major
> Fix For: 0.15.0
>
>
> Gobblin uses old hive 1.0, compiling against Hive 1.2 is not compatible due 
> to backward incompatible changes in Hive 1.2
> Also there are many changes in Hive 2.x and 3.x, that we need to think about 
> on how to provide the support for newer Hive versions in Gobblin





[jira] [Updated] (GOBBLIN-818) upgrade default hadoop versions to 2.7.x

2019-11-25 Thread Jay Sen (Jira)


 [ 
https://issues.apache.org/jira/browse/GOBBLIN-818?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jay Sen updated GOBBLIN-818:

Parent: GOBBLIN-824
Issue Type: Sub-task  (was: Improvement)

> upgrade default hadoop versions to 2.7.x
> 
>
> Key: GOBBLIN-818
> URL: https://issues.apache.org/jira/browse/GOBBLIN-818
> Project: Apache Gobblin
>  Issue Type: Sub-task
>Reporter: Jay Sen
>Priority: Major
>
> Gobblin should also upgrade Hadoop from 2.3 to 2.7.x at least, which is very 
> stable.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (GOBBLIN-979) upgrade Hive version

2019-11-25 Thread Jay Sen (Jira)


 [ 
https://issues.apache.org/jira/browse/GOBBLIN-979?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jay Sen updated GOBBLIN-979:

Parent: (was: GOBBLIN-818)
Issue Type: Bug  (was: Sub-task)

> upgrade Hive version
> 
>
> Key: GOBBLIN-979
> URL: https://issues.apache.org/jira/browse/GOBBLIN-979
> Project: Apache Gobblin
>  Issue Type: Bug
>  Components: hive-registration
>Affects Versions: 0.14.0
>Reporter: Jay Sen
>Priority: Major
> Fix For: 0.15.0
>
>
> Gobblin uses old hive 1.0, compiling against Hive 1.2 is not compatible due 
> to backward incompatible changes in Hive 1.2
> Also there are many changes in Hive 2.x and 3.x, that we need to think about 
> on how to provide the support for newer Hive versions in Gobblin





[jira] [Updated] (GOBBLIN-979) upgrade Hive version

2019-11-25 Thread Jay Sen (Jira)


 [ 
https://issues.apache.org/jira/browse/GOBBLIN-979?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jay Sen updated GOBBLIN-979:

Parent: GOBBLIN-818
Issue Type: Sub-task  (was: Bug)

> upgrade Hive version
> 
>
> Key: GOBBLIN-979
> URL: https://issues.apache.org/jira/browse/GOBBLIN-979
> Project: Apache Gobblin
>  Issue Type: Sub-task
>  Components: hive-registration
>Affects Versions: 0.14.0
>Reporter: Jay Sen
>Priority: Major
> Fix For: 0.15.0
>
>
> Gobblin uses old hive 1.0, compiling against Hive 1.2 is not compatible due 
> to backward incompatible changes in Hive 1.2
> Also there are many changes in Hive 2.x and 3.x, that we need to think about 
> on how to provide the support for newer Hive versions in Gobblin



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (GOBBLIN-979) upgrade Hive version

2019-11-25 Thread Jay Sen (Jira)
Jay Sen created GOBBLIN-979:
---

 Summary: upgrade Hive version
 Key: GOBBLIN-979
 URL: https://issues.apache.org/jira/browse/GOBBLIN-979
 Project: Apache Gobblin
  Issue Type: Bug
Reporter: Jay Sen


Gobblin uses the old Hive 1.0; compiling against Hive 1.2 fails due to 
backward-incompatible changes in Hive 1.2.

Also, there are many changes in Hive 2.x and 3.x, so we need to think about 
how to provide support for newer Hive versions in Gobblin.





[jira] [Updated] (GOBBLIN-979) upgrade Hive version

2019-11-25 Thread Jay Sen (Jira)


 [ 
https://issues.apache.org/jira/browse/GOBBLIN-979?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jay Sen updated GOBBLIN-979:

  Component/s: hive-registration
Fix Version/s: 0.15.0
Affects Version/s: 0.14.0

> upgrade Hive version
> 
>
> Key: GOBBLIN-979
> URL: https://issues.apache.org/jira/browse/GOBBLIN-979
> Project: Apache Gobblin
>  Issue Type: Bug
>  Components: hive-registration
>Affects Versions: 0.14.0
>Reporter: Jay Sen
>Priority: Major
> Fix For: 0.15.0
>
>
> Gobblin uses old hive 1.0, compiling against Hive 1.2 is not compatible due 
> to backward incompatible changes in Hive 1.2
> Also there are many changes in Hive 2.x and 3.x, that we need to think about 
> on how to provide the support for newer Hive versions in Gobblin





[jira] [Updated] (GOBBLIN-818) upgrade default hadoop versions to 2.7.x

2019-11-25 Thread Jay Sen (Jira)


 [ 
https://issues.apache.org/jira/browse/GOBBLIN-818?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jay Sen updated GOBBLIN-818:

Description: Gobblin should also upgrade Hadoop from 2.3 to 2.7.x at least, 
which is very stable.  (was: Gobblin uses old hive 1.0, compiling against Hive 
1.2 is not compatible due to backward incompatible changes in Hive 1.2.

Gobblin should also upgrade Hadoop from 2.3 to 2.7.7 at least which is very 
stable. )

> upgrade default hadoop versions to 2.7.x
> 
>
> Key: GOBBLIN-818
> URL: https://issues.apache.org/jira/browse/GOBBLIN-818
> Project: Apache Gobblin
>  Issue Type: Improvement
>Reporter: Jay Sen
>Priority: Major
>
> Gobblin should also upgrade Hadoop from 2.3 to 2.7.x at least, which is very 
> stable.





[jira] [Updated] (GOBBLIN-818) upgrade default hadoop versions to 2.7.x

2019-11-25 Thread Jay Sen (Jira)


 [ 
https://issues.apache.org/jira/browse/GOBBLIN-818?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jay Sen updated GOBBLIN-818:

Summary: upgrade default hadoop versions to 2.7.x  (was: upgrade default 
hadoop versions to 2.7.x and hive version to 1.2)

> upgrade default hadoop versions to 2.7.x
> 
>
> Key: GOBBLIN-818
> URL: https://issues.apache.org/jira/browse/GOBBLIN-818
> Project: Apache Gobblin
>  Issue Type: Improvement
>Reporter: Jay Sen
>Priority: Major
>
> Gobblin uses the old Hive 1.0, and it does not compile against Hive 1.2 
> because of backward-incompatible changes in Hive 1.2.
> Gobblin should also upgrade Hadoop from 2.3 to at least 2.7.7, which is very 
> stable. 





[jira] [Created] (GOBBLIN-959) gobblin-http module fails to compile due to missing dependency

2019-11-12 Thread Jay Sen (Jira)
Jay Sen created GOBBLIN-959:
---

 Summary: gobblin-http module fails to compile due to missing 
dependency
 Key: GOBBLIN-959
 URL: https://issues.apache.org/jira/browse/GOBBLIN-959
 Project: Apache Gobblin
  Issue Type: Bug
Affects Versions: 0.14.0
Reporter: Jay Sen
 Fix For: 0.15.0


{{You can see the following error while compiling this module:}}

{{./gradlew :gobblin-module:gobblin-http:build -x findbugsMain -x rat -x 
checkstyleMain}}

{{> Task :gobblin-modules:gobblin-http:compileTestJava FAILED}}
{{/Users/jsenjaliya/src/apache/gobblin/gobblin-modules/gobblin-http/src/test/java/org/apache/gobblin/util/HttpUtilsTest.java:28:
 error: package junit.framework does not exist}}
{{import junit.framework.Assert;}}
{{ ^}}

 

{{It is basically missing the junit dependency in the Gradle file.}}

 





[jira] [Created] (GOBBLIN-952) PathAlterationListener needs to pick up only modified jobs

2019-11-08 Thread Jay Sen (Jira)
Jay Sen created GOBBLIN-952:
---

 Summary: PathAlterationListener needs to pick up only modified jobs
 Key: GOBBLIN-952
 URL: https://issues.apache.org/jira/browse/GOBBLIN-952
 Project: Apache Gobblin
  Issue Type: Bug
Affects Versions: 0.14.0
Reporter: Jay Sen
 Fix For: 0.15.0


{{PathAlterationListenerAdaptorForMonitor}} detects any change to the job 
config path, which is the required functionality, but it also tries to 
schedule or unschedule a job for every change it sees. 

When a job is configured to run only once, its config file is moved to a .done 
file after the run, and the monitor then tries to unschedule the job.

Maybe the monitor needs additional logic to ignore changes like this.
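
One possible shape for that extra logic, as a minimal sketch only: before 
reacting to a deletion, check whether the config was merely renamed to a 
.done sibling. The class and method names below are hypothetical and are not 
part of {{PathAlterationListenerAdaptorForMonitor}}; the <name>.done naming 
is assumed from the description above.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

class DoneFileAwareFilter {
  // Assumed marker suffix for finished run-once jobs.
  static final String DONE_SUFFIX = ".done";

  /** True when the deleted config was only renamed to <name>.done. */
  static boolean isCompletionRename(Path deletedConfig) {
    Path done = deletedConfig.resolveSibling(deletedConfig.getFileName() + DONE_SUFFIX);
    return Files.exists(done);
  }

  /** True when a deletion event should lead to an unschedule attempt. */
  static boolean shouldUnschedule(Path deletedConfig) {
    return !isCompletionRename(deletedConfig);
  }

  public static void main(String[] args) throws IOException {
    Path dir = Files.createTempDirectory("gobblin-jobs");
    // Left behind by a finished run-once job.
    Files.createFile(dir.resolve("once.pull.done"));

    System.out.println(shouldUnschedule(dir.resolve("once.pull")));    // false
    System.out.println(shouldUnschedule(dir.resolve("removed.pull"))); // true
  }
}
```

A check like this would only suppress the "Could not find a scheduled job to 
unschedule" path for completed jobs; real deletions would still be handled.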

Here is the log:

{code:java}
22:57:38.803 INFO: [TaskExecutor STOPPING-97] 
[org.apache.gobblin.runtime.TaskExecutor] TaskExecutor - Successfully shutdown 
ExecutorService: 
com.google.common.util.concurrent.MoreExecutors$ScheduledListeningDecorator@6b5a367e
 22:57:38.803 INFO: [TaskExecutor STOPPING-97] 
[org.apache.gobblin.runtime.TaskExecutor] TaskExecutor - Attempting to shutdown 
ExecutorService: 
com.google.common.util.concurrent.MoreExecutors$ListeningDecorator@659dd565
 22:57:38.803 INFO: [TaskExecutor STOPPING-97] 
[org.apache.gobblin.runtime.TaskExecutor] TaskExecutor - Successfully shutdown 
ExecutorService: 
com.google.common.util.concurrent.MoreExecutors$ListeningDecorator@659dd565
 22:57:38.803 INFO: [LocalTaskStateTracker STOPPING-98] 
[org.apache.gobblin.runtime.local.LocalTaskStateTracker] LocalTaskStateTracker 
- Stopping the task state tracker
 22:57:38.803 INFO: [LocalTaskStateTracker STOPPING-98] 
[org.apache.gobblin.runtime.local.LocalTaskStateTracker] LocalTaskStateTracker 
- Attempting to shutdown ExecutorService: 
com.google.common.util.concurrent.MoreExecutors$ScheduledListeningDecorator@5faf4e52
 22:57:38.804 INFO: [LocalTaskStateTracker STOPPING-98] 
[org.apache.gobblin.runtime.local.LocalTaskStateTracker] LocalTaskStateTracker 
- Successfully shutdown ExecutorService: 
com.google.common.util.concurrent.MoreExecutors$ScheduledListeningDecorator@5faf4e52
 22:57:38.804 INFO: [JobScheduler-2-70] 
[org.apache.gobblin.source.extractor.filebased.FileBasedSource] FileBasedSource 
- Shutting down the FileSystemHelper connection
 22:57:56.476 INFO: [newDaemonThreadFactory-26] 
[org.apache.gobblin.scheduler.PathAlterationListenerAdaptorForMonitor] 
PathAlterationListenerAdaptorForMonitor - Detected deletion of job 
configuration file 
file:/home/jsenjaliya/gobblin-dist/gobblin-jobs/local-aerospike-avro.pull
 22:57:56.476 INFO: [newDaemonThreadFactory-26] 
[org.apache.gobblin.scheduler.PathAlterationListenerAdaptorForMonitor] 
PathAlterationListenerAdaptorForMonitor - Could not find a scheduled job to 
unschedule with path 
/home/jsenjaliya/gobblin-dist/gobblin-jobs/local-aerospike-avro.pull
{code}
 





[jira] [Created] (GOBBLIN-949) add dry run in gobblin.sh script

2019-11-06 Thread Jay Sen (Jira)
Jay Sen created GOBBLIN-949:
---

 Summary: add dry run in gobblin.sh script
 Key: GOBBLIN-949
 URL: https://issues.apache.org/jira/browse/GOBBLIN-949
 Project: Apache Gobblin
  Issue Type: Bug
Affects Versions: 0.14.0
Reporter: Jay Sen
 Fix For: 0.15.0


Often we just need to see what is going to be executed without actually 
executing it, i.e. a dry run. This also helps with testing the script 
itself.





[jira] [Created] (GOBBLIN-943) remove pid file only when it exists

2019-11-01 Thread Jay Sen (Jira)
Jay Sen created GOBBLIN-943:
---

 Summary: remove pid file only when it exists
 Key: GOBBLIN-943
 URL: https://issues.apache.org/jira/browse/GOBBLIN-943
 Project: Apache Gobblin
  Issue Type: Sub-task
Reporter: Jay Sen








[jira] [Updated] (GOBBLIN-942) use HOCON for job configurations instead of Java Properties

2019-10-31 Thread Jay Sen (Jira)


 [ 
https://issues.apache.org/jira/browse/GOBBLIN-942?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jay Sen updated GOBBLIN-942:

Summary: use HOCON for job configurations instead of Java Properties  (was: 
HOCON vs Java Prop for job configurations)

> use HOCON for job configurations instead of Java Properties
> ---
>
> Key: GOBBLIN-942
> URL: https://issues.apache.org/jira/browse/GOBBLIN-942
> Project: Apache Gobblin
>  Issue Type: Sub-task
>Reporter: Jay Sen
>Priority: Major
>
> Currently {{PullFileLoader}} uses the Java properties loader for *.pull & *.job 
> files and the HOCON loader for *.json & *.conf files.
> This introduces a lot of inconsistencies between how platform config and job 
> config are treated. I guess we previously used the \{env:***} format for 
> env variables, but GOBBLIN-939 makes this HOCON compatible.
> The current issue with treating a job config file as a Java properties file is 
> that it adds quotes around an already quoted string, which HOCON handles well.
>  
>  





[jira] [Created] (GOBBLIN-942) HOCON vs Java Prop for job configurations

2019-10-31 Thread Jay Sen (Jira)
Jay Sen created GOBBLIN-942:
---

 Summary: HOCON vs Java Prop for job configurations
 Key: GOBBLIN-942
 URL: https://issues.apache.org/jira/browse/GOBBLIN-942
 Project: Apache Gobblin
  Issue Type: Sub-task
Reporter: Jay Sen


Currently {{PullFileLoader}} uses the Java properties loader for *.pull & *.job 
files and the HOCON loader for *.json & *.conf files.

This introduces a lot of inconsistencies between how platform config and job 
config are treated. I guess we previously used the \{env:***} format for 
env variables, but GOBBLIN-939 makes this HOCON compatible.

The current issue with treating a job config file as a Java properties file is 
that it adds quotes around an already quoted string, which HOCON handles well.
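
The properties side of the quoting difference can be seen with the JDK alone. 
This is a minimal sketch, not the {{PullFileLoader}} code: 
java.util.Properties treats double quotes as ordinary characters, so a quoted 
value keeps its quotes, whereas a HOCON parser treats them as string 
delimiters. The class name and the key are illustrative only.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Properties;

class PropertiesQuoteDemo {
  /** Loads one key from properties-format text using java.util.Properties. */
  static String loadValue(String text, String key) {
    Properties props = new Properties();
    try {
      props.load(new StringReader(text));
    } catch (IOException e) {
      // Not expected for an in-memory reader.
      throw new IllegalStateException(e);
    }
    return props.getProperty(key);
  }

  public static void main(String[] args) {
    // The file content is: job.name="my job"
    String value = loadValue("job.name=\"my job\"", "job.name");
    // The quotes are part of the value, so this prints "my job" with quotes.
    System.out.println(value);
  }
}
```

Any later step that quotes string values again would then produce the double 
quoting described above.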

 

 





[jira] [Updated] (GOBBLIN-939) Integrate usage of env variables in gobblin scripts and configs

2019-10-31 Thread Jay Sen (Jira)


 [ 
https://issues.apache.org/jira/browse/GOBBLIN-939?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jay Sen updated GOBBLIN-939:

Description: 
1. standardize config with ENV variables 

2. define default env variables in gobblin-env.sh

  was:standardize the ability to override gobblin variables via env variable 
with new gobblin.sh script.


> Integrate usage of env variables in gobblin scripts and configs
> ---
>
> Key: GOBBLIN-939
> URL: https://issues.apache.org/jira/browse/GOBBLIN-939
> Project: Apache Gobblin
>  Issue Type: Improvement
>Affects Versions: 0.14.0
>Reporter: Jay Sen
>Priority: Major
> Fix For: 0.15.0
>
>
> 1. standardize config with ENV variables 
> 2. define default env variables in gobblin-env.sh





[jira] [Updated] (GOBBLIN-939) Integrate usage of env variables in gobblin scripts and configs

2019-10-31 Thread Jay Sen (Jira)


 [ 
https://issues.apache.org/jira/browse/GOBBLIN-939?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jay Sen updated GOBBLIN-939:

Summary: Integrate usage of env variables in gobblin scripts and configs  
(was: standardize the ability to override gobblin variables via env variable 
with new gobblin.sh script)

> Integrate usage of env variables in gobblin scripts and configs
> ---
>
> Key: GOBBLIN-939
> URL: https://issues.apache.org/jira/browse/GOBBLIN-939
> Project: Apache Gobblin
>  Issue Type: Improvement
>Affects Versions: 0.14.0
>Reporter: Jay Sen
>Priority: Major
> Fix For: 0.15.0
>
>
> standardize the ability to override gobblin variables via env variable with 
> new gobblin.sh script.





[jira] [Created] (GOBBLIN-939) standardize the ability to override gobblin variables via env variable with new gobblin.sh script

2019-10-31 Thread Jay Sen (Jira)
Jay Sen created GOBBLIN-939:
---

 Summary: standardize the ability to override gobblin variables via 
env variable with new gobblin.sh script
 Key: GOBBLIN-939
 URL: https://issues.apache.org/jira/browse/GOBBLIN-939
 Project: Apache Gobblin
  Issue Type: Improvement
Affects Versions: 0.14.0
Reporter: Jay Sen
 Fix For: 0.15.0


standardize the ability to override gobblin variables via env variable with new 
gobblin.sh script.





[jira] [Created] (GOBBLIN-937) fix help text and align it with variable names

2019-10-30 Thread Jay Sen (Jira)
Jay Sen created GOBBLIN-937:
---

 Summary: fix help text and align it with variable names
 Key: GOBBLIN-937
 URL: https://issues.apache.org/jira/browse/GOBBLIN-937
 Project: Apache Gobblin
  Issue Type: Sub-task
Affects Versions: 0.14.0
Reporter: Jay Sen
 Fix For: 0.15.0








[jira] [Created] (GOBBLIN-936) check if job is disabled first before parsing and scheduling job

2019-10-30 Thread Jay Sen (Jira)
Jay Sen created GOBBLIN-936:
---

 Summary: check if job is disabled first before parsing and scheduling 
job
 Key: GOBBLIN-936
 URL: https://issues.apache.org/jira/browse/GOBBLIN-936
 Project: Apache Gobblin
  Issue Type: Improvement
  Components: gobblin-core
Affects Versions: 0.14.0
Reporter: Jay Sen
Assignee: Abhishek Tiwari
 Fix For: 0.15.0


If a job is disabled, Gobblin should not even try to parse its config or do 
any other work for it.

Currently this check happens at the very end of job scheduling; it should be 
the first thing checked, so that a disabled job is skipped immediately.
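
The intended ordering can be sketched as follows. This is an illustration 
under assumptions, not the scheduler code: the key name "job.disabled" is 
assumed and should be checked against Gobblin's ConfigurationKeys, and the 
class and method names are hypothetical.

```java
import java.util.Properties;

class DisabledJobCheck {
  // Assumed key name for the disabled flag.
  static final String JOB_DISABLED_KEY = "job.disabled";

  static boolean isDisabled(Properties jobProps) {
    return Boolean.parseBoolean(jobProps.getProperty(JOB_DISABLED_KEY, "false"));
  }

  /** Returns true if the job reached the (stand-in) scheduling step. */
  static boolean scheduleIfEnabled(Properties jobProps) {
    if (isDisabled(jobProps)) {
      // Bail out before any template resolution or config parsing.
      return false;
    }
    // Expensive parsing and scheduling would happen here.
    return true;
  }

  public static void main(String[] args) {
    Properties disabled = new Properties();
    disabled.setProperty(JOB_DISABLED_KEY, "true");

    System.out.println(scheduleIfEnabled(disabled));         // false
    System.out.println(scheduleIfEnabled(new Properties())); // true
  }
}
```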

 





[jira] [Closed] (GOBBLIN-935) reloading job config throws NPE

2019-10-30 Thread Jay Sen (Jira)


 [ 
https://issues.apache.org/jira/browse/GOBBLIN-935?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jay Sen closed GOBBLIN-935.
---
Resolution: Duplicate

> reloading job config throws NPE 
> 
>
> Key: GOBBLIN-935
> URL: https://issues.apache.org/jira/browse/GOBBLIN-935
> Project: Apache Gobblin
>  Issue Type: Improvement
>  Components: gobblin-core
>Affects Versions: 0.14.0
>Reporter: Jay Sen
>Assignee: Abhishek Tiwari
>Priority: Major
> Fix For: 0.15.0
>
>
> Steps to reproduce:
>  # Run Gobblin in standalone mode (I used standalone mode, but all other modes 
> should behave the same).
>  # Wait for any job to finish, successfully or unsuccessfully.
>  # Rename the completed job from .pull.done to .pull, OR 
> make any change to the existing .pull file (like adding a space).
>  # The job will be picked up by the observer and will throw an NPE in 
> {{SchedulerUtils}}.{{resolveTemplate}}.
> This is due to a null JobSpecResolver, which happens only when the job 
> config is reloaded; the regular flow is not affected because it 
> takes a different code path.
>  





[jira] [Commented] (GOBBLIN-935) reloading job config throws NPE

2019-10-30 Thread Jay Sen (Jira)


[ 
https://issues.apache.org/jira/browse/GOBBLIN-935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16963199#comment-16963199
 ] 

Jay Sen commented on GOBBLIN-935:
-

After rebasing the repo, I just saw that @William Lo got to this bug first. 
Thanks, @William Lo, for fixing this.

> reloading job config throws NPE 
> 
>
> Key: GOBBLIN-935
> URL: https://issues.apache.org/jira/browse/GOBBLIN-935
> Project: Apache Gobblin
>  Issue Type: Improvement
>  Components: gobblin-core
>Affects Versions: 0.14.0
>Reporter: Jay Sen
>Assignee: Abhishek Tiwari
>Priority: Major
> Fix For: 0.15.0
>
>
> Steps to reproduce:
>  # Run Gobblin in standalone mode (I used standalone mode, but all other modes 
> should behave the same).
>  # Wait for any job to finish, successfully or unsuccessfully.
>  # Rename the completed job from .pull.done to .pull, OR 
> make any change to the existing .pull file (like adding a space).
>  # The job will be picked up by the observer and will throw an NPE in 
> {{SchedulerUtils}}.{{resolveTemplate}}.
> This is due to a null JobSpecResolver, which happens only when the job 
> config is reloaded; the regular flow is not affected because it 
> takes a different code path.
>  





[jira] [Created] (GOBBLIN-935) reloading job config throws NPE

2019-10-30 Thread Jay Sen (Jira)
Jay Sen created GOBBLIN-935:
---

 Summary: reloading job config throws NPE 
 Key: GOBBLIN-935
 URL: https://issues.apache.org/jira/browse/GOBBLIN-935
 Project: Apache Gobblin
  Issue Type: Improvement
  Components: gobblin-core
Affects Versions: 0.14.0
Reporter: Jay Sen
Assignee: Abhishek Tiwari
 Fix For: 0.15.0


Steps to reproduce:
 # Run Gobblin in standalone mode (I used standalone mode, but all other modes 
should behave the same).
 # Wait for any job to finish, successfully or unsuccessfully.
 # Rename the completed job from .pull.done to .pull, OR 
make any change to the existing .pull file (like adding a space).
 # The job will be picked up by the observer and will throw an NPE in 
{{SchedulerUtils}}.{{resolveTemplate}}.

This is due to a null JobSpecResolver, which happens only when the job 
config is reloaded; the regular flow is not affected because it 
takes a different code path.
 





[jira] [Created] (GOBBLIN-934) scheduling job logic is broken in PathAlterationListenerAdaptorForMonitor

2019-10-29 Thread Jay Sen (Jira)
Jay Sen created GOBBLIN-934:
---

 Summary: scheduling job logic is broken in 
PathAlterationListenerAdaptorForMonitor
 Key: GOBBLIN-934
 URL: https://issues.apache.org/jira/browse/GOBBLIN-934
 Project: Apache Gobblin
  Issue Type: Improvement
  Components: gobblin-core
Affects Versions: 0.14.0
Reporter: Jay Sen
Assignee: Abhishek Tiwari
 Fix For: 0.15.0








[jira] [Created] (GOBBLIN-929) create ZookeeperCommitSequenceStore

2019-10-26 Thread Jay Sen (Jira)
Jay Sen created GOBBLIN-929:
---

 Summary: create ZookeeperCommitSequenceStore
 Key: GOBBLIN-929
 URL: https://issues.apache.org/jira/browse/GOBBLIN-929
 Project: Apache Gobblin
  Issue Type: Improvement
  Components: gobblin-api, gobblin-core
Affects Versions: 0.14.0
Reporter: Jay Sen
Assignee: Hung Tran
 Fix For: 0.15.0


Currently, for {{EXACTLY_ONCE}} semantics, only the file-system-based 
{{FsCommitSequenceStore}} is available, which can be slow for streaming 
use cases. 
We should implement ZK-based checkpointing for faster service.





[jira] [Updated] (GOBBLIN-927) config to properties fails to convert configObject to string

2019-10-25 Thread Jay Sen (Jira)


 [ 
https://issues.apache.org/jira/browse/GOBBLIN-927?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jay Sen updated GOBBLIN-927:

Description: 
If you specify source.schema = \{\}, then 
{{ConfigUtils.configToProperties}} fails to convert this object directly 
to a string.
The solution is to convert any possible object to a string.

  was:
if you specify source.schema = { } then 
{{ConfigUtils.configToProperties}} would fail to convert this object directly 
to string.
solution is to use convert any possible objects to string.


> config to properties fails to convert configObject to string
> 
>
> Key: GOBBLIN-927
> URL: https://issues.apache.org/jira/browse/GOBBLIN-927
> Project: Apache Gobblin
>  Issue Type: Bug
>Reporter: Jay Sen
>Priority: Minor
>
> If you specify source.schema = \{\}, then 
> {{ConfigUtils.configToProperties}} fails to convert this object directly 
> to a string.
> The solution is to convert any possible object to a string.





[jira] [Created] (GOBBLIN-927) config to properties fails to convert configObject to string

2019-10-25 Thread Jay Sen (Jira)
Jay Sen created GOBBLIN-927:
---

 Summary: config to properties fails to convert configObject to 
string
 Key: GOBBLIN-927
 URL: https://issues.apache.org/jira/browse/GOBBLIN-927
 Project: Apache Gobblin
  Issue Type: Bug
Reporter: Jay Sen


If you specify source.schema = { }, then 
{{ConfigUtils.configToProperties}} fails to convert this object directly 
to a string.
The solution is to convert any possible object to a string.
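
A minimal sketch of that idea, using a plain Map as a stand-in for the 
flattened config, since the actual {{ConfigUtils}} code is not shown here. 
The class and method names are hypothetical; the point is only that every 
value goes through String.valueOf before it is stored in Properties.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Properties;

class StringifyingProperties {
  /** Copies every entry into Properties, converting each value to a String. */
  static Properties toProperties(Map<String, Object> flatConfig) {
    Properties props = new Properties();
    for (Map.Entry<String, Object> entry : flatConfig.entrySet()) {
      // String.valueOf accepts any object, including nested maps.
      props.setProperty(entry.getKey(), String.valueOf(entry.getValue()));
    }
    return props;
  }

  public static void main(String[] args) {
    Map<String, Object> config = new LinkedHashMap<>();
    // An empty object value, like source.schema = { }.
    config.put("source.schema", new LinkedHashMap<String, Object>());
    config.put("job.retries", 3);

    Properties props = toProperties(config);
    System.out.println(props.getProperty("source.schema")); // {}
    System.out.println(props.getProperty("job.retries"));   // 3
  }
}
```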





[jira] [Created] (GOBBLIN-926) use typesafe config consistently everywhere

2019-10-25 Thread Jay Sen (Jira)
Jay Sen created GOBBLIN-926:
---

 Summary: use typesafe config consistently everywhere
 Key: GOBBLIN-926
 URL: https://issues.apache.org/jira/browse/GOBBLIN-926
 Project: Apache Gobblin
  Issue Type: Improvement
Affects Versions: 0.14.0
Reporter: Jay Sen
 Fix For: 0.15.0


Basically, remove the usage of and conversion to java.util.Properties.





[jira] [Created] (GOBBLIN-920) simple json example is missing Apache Commons VFS dep.

2019-10-22 Thread Jay Sen (Jira)
Jay Sen created GOBBLIN-920:
---

 Summary: simple json example is missing Apache Commons VFS dep.
 Key: GOBBLIN-920
 URL: https://issues.apache.org/jira/browse/GOBBLIN-920
 Project: Apache Gobblin
  Issue Type: Bug
Reporter: Jay Sen


When you run the simple JSON example from the gobblin-example module in mapreduce 
mode, it reports an error about a missing Apache Commons VFS dependency.
The solution is to include the dependency in gobblin-example so that it is placed 
in lib and is available when running the example.





[jira] [Created] (GOBBLIN-901) add some info logging to FsDataWriter

2019-10-03 Thread Jay Sen (Jira)
Jay Sen created GOBBLIN-901:
---

 Summary: add some info logging to FsDataWriter
 Key: GOBBLIN-901
 URL: https://issues.apache.org/jira/browse/GOBBLIN-901
 Project: Apache Gobblin
  Issue Type: Bug
  Components: gobblin-core
Affects Versions: 0.14.0
Reporter: Jay Sen
Assignee: Abhishek Tiwari
 Fix For: 0.15.0


Moving the staging data to the final output dir sometimes fails for 
various reasons, and it is hard to debug without proper info. 
Adding some more logging would help debug the issue, especially for new users.






[jira] [Closed] (GOBBLIN-900) FsDataWriter dont use writer config for FileSystem

2019-10-03 Thread Jay Sen (Jira)


 [ 
https://issues.apache.org/jira/browse/GOBBLIN-900?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jay Sen closed GOBBLIN-900.
---
Resolution: Duplicate

> FsDataWriter dont use writer config for FileSystem
> --
>
> Key: GOBBLIN-900
> URL: https://issues.apache.org/jira/browse/GOBBLIN-900
> Project: Apache Gobblin
>  Issue Type: Bug
>  Components: gobblin-core
>Affects Versions: 0.14.0
>Reporter: Jay Sen
>Assignee: Abhishek Tiwari
>Priority: Major
> Fix For: 0.15.0
>
>
> FsDataWriter.java:107
> {{this.fileContext = FileContext.getFileContext(conf);}} 
> FileContext is used while moving data from staging to the final output path. 
> Currently it uses the default Hadoop Configuration, which can be picked up from 
> a local Hadoop config on the classpath, which is not right. The writer filesystem 
> should always be driven by the writer config specified in the job or platform 
> config.





[jira] [Commented] (GOBBLIN-900) FsDataWriter dont use writer config for FileSystem

2019-10-03 Thread Jay Sen (Jira)


[ 
https://issues.apache.org/jira/browse/GOBBLIN-900?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16944156#comment-16944156
 ] 

Jay Sen commented on GOBBLIN-900:
-

Looks like GOBBLIN-889 was filed for the same issue but solves it a bit differently.
My fix was the following one-line change:
{{this.fileContext = FileContext.getFileContext(this.fs.getUri());}}

{{this.fs}}, the FileSystem object, is already initialized above this line, 
so we can get the URI directly from it.  



> FsDataWriter dont use writer config for FileSystem
> --
>
> Key: GOBBLIN-900
> URL: https://issues.apache.org/jira/browse/GOBBLIN-900
> Project: Apache Gobblin
>  Issue Type: Bug
>  Components: gobblin-core
>Affects Versions: 0.14.0
>Reporter: Jay Sen
>Assignee: Abhishek Tiwari
>Priority: Major
> Fix For: 0.15.0
>
>
> FsDataWriter.java:107
> {{this.fileContext = FileContext.getFileContext(conf);}} 
> FileContext is used while moving data from staging to the final output path. 
> Currently it uses the default Hadoop Configuration, which can be picked up from 
> a local Hadoop config on the classpath, which is not right. The writer filesystem 
> should always be driven by the writer config specified in the job or platform 
> config.





[jira] [Created] (GOBBLIN-898) successful Job completion wrongly calls onJobFailure

2019-10-03 Thread Jay Sen (Jira)
Jay Sen created GOBBLIN-898:
---

 Summary: successful Job completion wrongly calls onJobFailure
 Key: GOBBLIN-898
 URL: https://issues.apache.org/jira/browse/GOBBLIN-898
 Project: Apache Gobblin
  Issue Type: Bug
Reporter: Jay Sen


We came across this code snippet, which registers a job listener for the 
JobCompleteTimer and JobSucceededTimer events. It looks like the JobSucceededTimer 
event calls the onJobFailure function instead of the onJobSucced function.

notifyListeners(this.jobContext, jobListener, 
    TimingEvent.LauncherTimings.JOB_SUCCEEDED, new JobListenerAction() {
  @Override
  public void apply(JobListener jobListener, JobContext jobContext)
      throws Exception {
    jobListener.onJobFailure(jobContext);
  }
});

Qinghe from our team found this, so reporting on behalf of him here
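
For illustration, the expected routing can be sketched with a self-contained 
stand-in. The Listener interface, its method names, and the Outcome enum 
below are hypothetical placeholders and are not Gobblin's JobListener API; 
the sketch only shows a success event reaching a success callback and a 
failure event reaching a failure callback.

```java
import java.util.ArrayList;
import java.util.List;

class ListenerDispatchSketch {
  /** Hypothetical listener; the method names are placeholders. */
  interface Listener {
    void onSuccess(String jobId);
    void onFailure(String jobId);
  }

  enum Outcome { SUCCEEDED, FAILED }

  /** Routes each outcome to the matching callback. */
  static void notifyListener(Listener listener, Outcome outcome, String jobId) {
    switch (outcome) {
      case SUCCEEDED:
        listener.onSuccess(jobId);
        break;
      case FAILED:
        listener.onFailure(jobId);
        break;
    }
  }

  /** Records which callbacks fired so the routing can be inspected. */
  static List<String> record(Outcome outcome) {
    List<String> calls = new ArrayList<>();
    notifyListener(new Listener() {
      @Override public void onSuccess(String jobId) { calls.add("success:" + jobId); }
      @Override public void onFailure(String jobId) { calls.add("failure:" + jobId); }
    }, outcome, "job-1");
    return calls;
  }

  public static void main(String[] args) {
    System.out.println(record(Outcome.SUCCEEDED)); // [success:job-1]
    System.out.println(record(Outcome.FAILED));    // [failure:job-1]
  }
}
```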





[jira] [Created] (GOBBLIN-867) use logger from class instead of passing it around

2019-09-05 Thread Jay Sen (Jira)
Jay Sen created GOBBLIN-867:
---

 Summary: use logger from class instead of passing it around
 Key: GOBBLIN-867
 URL: https://issues.apache.org/jira/browse/GOBBLIN-867
 Project: Apache Gobblin
  Issue Type: Improvement
Reporter: Jay Sen


Some classes, mainly in static methods, expect a logger as a parameter, so callers 
have to pass one in. These static methods are utilities anyway and should 
just have their own logger.
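
A minimal sketch of the proposed pattern. It uses java.util.logging so that 
it stays self-contained; the Gobblin code base may use a different logging 
API, and the utility class and method below are hypothetical.

```java
import java.util.logging.Logger;

final class FileNameUtils {
  // The utility class owns its logger, so callers no longer pass one in.
  private static final Logger LOG = Logger.getLogger(FileNameUtils.class.getName());

  private FileNameUtils() {
    // Static utility class, not meant to be instantiated.
  }

  /** Strips a trailing ".done" marker from a file name, if present. */
  static String stripDoneSuffix(String name) {
    String suffix = ".done";
    if (name.endsWith(suffix)) {
      LOG.fine("Stripping " + suffix + " from " + name);
      return name.substring(0, name.length() - suffix.length());
    }
    return name;
  }

  public static void main(String[] args) {
    System.out.println(stripDoneSuffix("job.pull.done")); // job.pull
    System.out.println(stripDoneSuffix("job.pull"));      // job.pull
  }
}
```

Compared with a signature that takes a Logger argument, log records are then 
attributed to the utility class rather than to whichever caller supplied the 
logger.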




--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Updated] (GOBBLIN-707) combine & standardize all gobblin scripts into one master script & restructure configs accordingly

2019-09-03 Thread Jay Sen (Jira)


 [ 
https://issues.apache.org/jira/browse/GOBBLIN-707?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jay Sen updated GOBBLIN-707:

Description: 
Gobblin supports multiple modes of execution (CLI, Standalone, 
cluster-master, cluster-worker, AWS, YARN, MR) and various command-line 
utilities to run cli and admin commands. The problem is that each cli and 
execution mode has its own script to manage the service.

Having individual scripts introduces a lot of issues:
 # All scripts handle Gobblin variables and user parameters differently, and this 
is highly inconsistent across the various Gobblin scripts, not to mention the 
different features supported by different scripts.
 # Functionality around start, stop, status checking, and PID handling, among 
a lot of other things, varies vastly with the implementation of each script.
 # Features like GC & JVM params, log4j file selection, classpath calculation, 
etc. exist in some Gobblin scripts but not all, adding to an inconsistent user 
experience.
 # Code duplication: all the Gobblin scripts share a lot of common code to handle 
params, start and stop services, status checks, PID handling, etc. Combining all 
the scripts into one not only makes maintenance easier but also brings clarity 
and consistency. 
 # Basically, the current 13 different scripts confuse new users about how to 
use Gobblin.


Solution:

1. there can be one gobblin.sh script to handle all gobblin commands and 
deployment options as per following signature. NOTE: This

{{gobblin.sh   }}
 {{gobblin.sh   }}

{{commands values: admin, cli, statestore-check, statestore-clean, 
historystore-manager, classpath}}
 {{service values: standalone, cluster-master, cluster-worker, aws, yarn, mr, 
service}}

With the above change, the following become valid commands:
{code:java}
# all under GobblinCli class
gobblin run listQuickApps  -> gobblin cli run listQuickApps 
gobblin run  -> gobblin cli run  

# class: JobStateToJsonConverter
statestore-checker.sh  -> gobblin cli job-state-to-json 

# class: StateStoreCleaner
statestore-clean.sh  -> the class is deprecated, so there is no need to migrate this over.

# class: DatabaseJobHistoryStoreSchemaManager
historystore-manager.sh  -> gobblin cli job-store-schema-manager 

# class: Cli
gobblin-admin.sh  -> gobblin cli admin 

# all gobblin deployment modes
gobblin-cluster-master.sh  -> gobblin service cluster-master start|stop|status
gobblin-cluster-worker.sh  -> gobblin service cluster-worker start|stop|status
gobblin-compaction.sh      -> gobblin-compaction.sh (kept as is for now, can be migrated to the new script framework)
gobblin-mapreduce.sh       -> gobblin service mapreduce start|stop|status
gobblin-service.sh         -> gobblin service service-manager start|stop|status
gobblin-standalone.sh      -> gobblin service standalone start|stop|status
gobblin-yarn.sh            -> gobblin service yarn start|stop|status
{code}
 

2. Also, the configurations for each mode need to be structured and de-duplicated 
to make it clear which config will be picked up for which 
execution mode. This would be well defined in the command help instructions.

 {color:#ff}
 NOTE: this refactoring adds all cli and service commands to gobblin.sh and 
hence changes the syntax for all commands and services.{color}

  was:
gobblin supports multiple modes of executions ( CLI, Standalone, 
cluster-master, cluster-worker, AWS, YARN, MR ) and various command lines 
utility to run cli and admin commands. The problem is each cli and execution 
mode has individual script to manage the service, which brings following 
problems.

Having individual script introduces lot of issues
 # all scripts handles gobblin variables, user parameters differently, and its 
highly inconsistent among various different gobblin scripts, not to mention 
different features supported by different scripts.
 # functionality around start, stop, status checking and handling PID's among 
lot of other things, varies vastly as per the implementation of the script.
 # features like GC & JVM params, log4j file selection, classpath calculation, 
etc... exists in some gobblin scripts but not all, adding to inconsistent user 
experience.
# code duplication: all the gobblin scripts share lot of common code to handle 
params, start, stop services, status checks, pid handling, etc... combining all 
the scripts into 1 not only makes maintenance easier but also brings clarity 
and consistency. 
# Basically, current 13 different scripts adds confusion to new user on how to 
use Gobblin or how to use it.


Solution:

1. there can be one gobblin.sh script to handle all gobblin commands and 
deployment options as per following signature. NOTE: This

{{gobblin.sh   }}
 {{gobblin.sh   }}

{{commands values: admin, cli, statestore-check, statestore-clean, 
historystore-manager, classpath}}
 {{service values: standalone, 

[jira] [Updated] (GOBBLIN-707) combine & standardize all gobblin scripts into one master script & restructure configs accordingly

2019-09-03 Thread Jay Sen (Jira)


 [ 
https://issues.apache.org/jira/browse/GOBBLIN-707?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jay Sen updated GOBBLIN-707:

Description: 
Gobblin supports multiple modes of execution (CLI, Standalone, 
cluster-master, cluster-worker, AWS, YARN, MR) and various command-line 
utilities to run cli and admin commands. The problem is that each cli and 
execution mode has its own script to manage the service.

Having individual scripts introduces a lot of issues:
 # All scripts handle Gobblin variables and user parameters differently, and this 
is highly inconsistent across the various Gobblin scripts, not to mention the 
different features supported by different scripts.
 # Functionality around start, stop, status checking, and PID handling, among 
a lot of other things, varies vastly with the implementation of each script.
 # Features like GC & JVM params, log4j file selection, classpath calculation, 
etc. exist in some Gobblin scripts but not all, adding to an inconsistent user 
experience.
 # Code duplication: all the Gobblin scripts share a lot of common code to handle 
params, start and stop services, status checks, PID handling, etc. Combining all 
the scripts into one not only makes maintenance easier but also brings clarity 
and consistency. 
 # Basically, the current 13 different scripts confuse new users about how to 
use Gobblin.


Solution:

1. there can be one gobblin.sh script to handle all gobblin commands and 
deployment options as per following signature. NOTE: This

{{gobblin.sh   }}
 {{gobblin.sh   }}

{{commands values: admin, cli, statestore-check, statestore-clean, 
historystore-manager, classpath}}
 {{service values: standalone, cluster-master, cluster-worker, aws, yarn, mr, 
service}}

With the above change, the following become valid commands:
{code:java}
# all under GobblinCli class
gobblin run listQuickApps  –> gobblin cli run listQuickApps
gobblin run listQuickApps  –> gobblin cli run listQuickApps
gobblin run  -> gobblin cli run 

# class: JobStateToJsonConverter
statestore-checker.sh  -> gobblin statestore-checker 

# class: StateStoreCleaner
statestore-clean.sh  -> gobblin statestore-clean 

# class: DatabaseJobHistoryStoreSchemaManager
historystore-manager.sh  -> gobblin historystore-manager 

# class: Cli
gobblin-admin.sh-> gobblin admin 

# all gobblin deployment modes
gobblin-cluster-master.sh   -> gobblin cluster-mater start|stop|status
gobblin-cluster-worker.sh   -> gobblin cluster-mater start|stop|status
gobblin-compaction.sh   -> gobblin cluster-mater start|stop|status
gobblin-env.sh  -> gobblin cluster-mater start|stop|status
gobblin-mapreduce.sh-> gobblin cluster-mater start|stop|status
gobblin-service.sh  -> gobblin cluster-mater start|stop|status
gobblin-standalone.sh   -> gobblin cluster-mater start|stop|status
gobblin-yarn.sh -> gobblin cluster-mater start|stop|status
{code}
 

2. Also, configs need to be structured and de-duplicated to make it clear 
which config will be picked up for which execution mode.

 {color:#ff}
 NOTE: this refactoring adds all cli and service commands to gobblin.sh and 
hence changes the syntax for all commands and services.{color}
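
A rough sketch of how a single entry point could tell commands apart from services, using the command and service values listed above. It is written in Java only for illustration; the class name `GobblinArgs` and the `classify` method are hypothetical and are not part of gobblin.sh or of any Gobblin module.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

/** Hypothetical illustration only; not Gobblin code. */
final class GobblinArgs {
    // Values copied from the "commands values" and "service values" lists in the ticket.
    static final Set<String> COMMANDS = new HashSet<>(Arrays.asList(
            "admin", "cli", "statestore-check", "statestore-clean",
            "historystore-manager", "classpath"));
    static final Set<String> SERVICES = new HashSet<>(Arrays.asList(
            "standalone", "cluster-master", "cluster-worker", "aws", "yarn", "mr", "service"));
    static final Set<String> ACTIONS = new HashSet<>(Arrays.asList("start", "stop", "status"));

    private GobblinArgs() {}

    /** Returns "command:NAME", "service:NAME:ACTION", or "usage" for anything else. */
    static String classify(String... args) {
        if (args.length == 0) {
            return "usage";
        }
        String first = args[0];
        if (COMMANDS.contains(first)) {
            // Remaining arguments would be passed through to the command itself.
            return "command:" + first;
        }
        if (SERVICES.contains(first) && args.length >= 2 && ACTIONS.contains(args[1])) {
            return "service:" + first + ":" + args[1];
        }
        return "usage";
    }
}
```

For example, `GobblinArgs.classify("yarn", "start")` yields `service:yarn:start`, while `GobblinArgs.classify("yarn")` yields `usage` because a service needs one of start|stop|status.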

  was:
gobblin supports multiple modes of execution (CLI, Standalone, 
cluster-master, cluster-worker, AWS, YARN, MR) and various command-line 
utilities to run cli and admin commands. There is an individual script for each 
of them.

Having individual scripts introduces a lot of issues:
 # all scripts handle gobblin variables and user parameters differently, and 
this is highly inconsistent among the various gobblin scripts
 # functionality around start, stop, status checking and PID handling, among a 
lot of other things, varies vastly with the implementation of each script.
 # features like GC & JVM params, log4j file selection, classpath calculation, 
etc. exist in some gobblin scripts but not all, adding to the inconsistent user 
experience.
 # maintaining a total of 13 scripts would be too much effort.

Also, all the gobblin scripts share a lot of common code to handle params, 
start and stop services, check status, handle pids, etc. Combining all the 
scripts into one not only makes maintenance easier but also brings clarity and 
consistency.

 

Solution:

1. there can be one gobblin.sh script to handle all gobblin commands and 
deployment options as per following signature. NOTE: This

{{gobblin.sh   }}
 {{gobblin.sh   }}

{{commands values: admin, cli, statestore-check, statestore-clean, 
historystore-manager, classpath}}
 {{service values: standalone, cluster-master, cluster-worker, aws, yarn, mr, 
service}}

With the above change, the following become valid commands:
{code:java}
# all under GobblinCli class
gobblin run listQuickApps  -> gobblin cli run listQuickApps
gobblin run  -> gobblin cli run 

# 

[jira] [Updated] (GOBBLIN-818) upgrade default hadoop versions to 2.7.x and hive version to 1.2

2019-09-02 Thread Jay Sen (Jira)


 [ 
https://issues.apache.org/jira/browse/GOBBLIN-818?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jay Sen updated GOBBLIN-818:

Description: 
Gobblin uses the old Hive 1.0 and is not compatible with Hive 1.2 at compile 
time, due to backward-incompatible changes in Hive 1.2.

Gobblin should also upgrade Hadoop from 2.3 to at least 2.7.7, which is very 
stable. 

  was:Gobblin uses old hive 1.0, compiling against Hive 1.2 is not compatible 
due to backward incompatible changes in Hive 1.2


> upgrade default hadoop versions to 2.7.x and hive version to 1.2
> 
>
> Key: GOBBLIN-818
> URL: https://issues.apache.org/jira/browse/GOBBLIN-818
> Project: Apache Gobblin
>  Issue Type: Improvement
>Reporter: Jay Sen
>Priority: Major
>
> Gobblin uses the old Hive 1.0 and is not compatible with Hive 1.2 at compile 
> time, due to backward-incompatible changes in Hive 1.2.
> Gobblin should also upgrade Hadoop from 2.3 to at least 2.7.7, which is very 
> stable. 



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Updated] (GOBBLIN-818) upgrade default hadoop versions to 2.7.x and hive version to 1.2

2019-09-02 Thread Jay Sen (Jira)


 [ 
https://issues.apache.org/jira/browse/GOBBLIN-818?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jay Sen updated GOBBLIN-818:

Description: Gobblin uses old hive 1.0, compiling against Hive 1.2 is not 
compatible due to backward incompatible changes in Hive 1.2  (was: Gobblin uses 
old hive 1.0, compiling against Hive 1.2 is not compatible due to backward 
incompatible changes in Hive 1.2

we should also update Hadoop libs from 2.3 to 2.7.7 at least which is very 
stable.)

> upgrade default hadoop versions to 2.7.x and hive version to 1.2
> 
>
> Key: GOBBLIN-818
> URL: https://issues.apache.org/jira/browse/GOBBLIN-818
> Project: Apache Gobblin
>  Issue Type: Improvement
>Reporter: Jay Sen
>Priority: Major
>
> Gobblin uses old hive 1.0, compiling against Hive 1.2 is not compatible due 
> to backward incompatible changes in Hive 1.2





[jira] [Created] (GOBBLIN-866) Re-Arrange Gobblin Modules and Classes as per GIP 2

2019-09-02 Thread Jay Sen (Jira)
Jay Sen created GOBBLIN-866:
---

 Summary: Re-Arrange Gobblin Modules and Classes as per GIP 2
 Key: GOBBLIN-866
 URL: https://issues.apache.org/jira/browse/GOBBLIN-866
 Project: Apache Gobblin
  Issue Type: Improvement
Affects Versions: 0.14.0
Reporter: Jay Sen
 Fix For: 0.15.0


Re-Arrange Gobblin Modules and Classes as per GIP 2: 
https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=119547565





[jira] [Commented] (GOBBLIN-830) update config key to define job launcher type

2019-09-02 Thread Jay Sen (Jira)


[ 
https://issues.apache.org/jira/browse/GOBBLIN-830?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16921056#comment-16921056
 ] 

Jay Sen commented on GOBBLIN-830:
-

To support both `launcher.type` and `job.launcher.type`:
- `JobLauncherType` is nested and needs to be taken out; since most of the 
`jobLauncher` code is defined in `gobblin-runtime`, it needs to be placed under 
that module.
- instead of handling if/else in multiple places, I should add a 
`JobLauncherUtils.getJobLauncherType` method, but that would create a circular 
dependency between `gobblin-utility` and `gobblin-runtime`.

This is a minor change compared to the amount of refactoring required to 
support both properties, which basically touches on the complexity of the 
project and calls for proper refactoring ( 
[GIP2](https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=119547565)
 )
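
A minimal sketch of the lookup that supporting both keys implies, assuming the new key wins when both are set (the ticket does not say which should take precedence). `LauncherTypeResolver` and `resolve` are hypothetical names, not the `JobLauncherUtils.getJobLauncherType` method mentioned above, and plain `java.util.Properties` stands in for Gobblin's config handling.

```java
import java.util.Properties;

/** Hypothetical helper for illustration; not part of Gobblin. */
final class LauncherTypeResolver {
    static final String NEW_KEY = "job.launcher.type";
    static final String OLD_KEY = "launcher.type";

    private LauncherTypeResolver() {}

    /**
     * Returns the value of the new key if present, otherwise the value of the
     * old key, otherwise the supplied default.
     */
    static String resolve(Properties props, String defaultType) {
        String value = props.getProperty(NEW_KEY);
        if (value == null) {
            // Fall back to the legacy key so existing job configs keep working.
            value = props.getProperty(OLD_KEY);
        }
        return value != null ? value : defaultType;
    }
}
```

Keeping the fallback in one place like this is what avoids the scattered if/else handling; where such a helper could live without creating the circular dependency is the open question raised above.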


> update config key to define job launcher type
> -
>
> Key: GOBBLIN-830
> URL: https://issues.apache.org/jira/browse/GOBBLIN-830
> Project: Apache Gobblin
>  Issue Type: Improvement
>Reporter: Jay Sen
>Priority: Minor
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> config key JOB_LAUNCHER_TYPE_KEY = "launcher.type" should be set to 
> "job.launcher.type"





[jira] [Updated] (GOBBLIN-822) upgrade log4j to log4j2

2019-08-12 Thread Jay Sen (JIRA)


 [ 
https://issues.apache.org/jira/browse/GOBBLIN-822?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jay Sen updated GOBBLIN-822:

Description: 
log4j2 has a routing appender that would be very useful and is probably the 
only way to achieve the "job specific log files" functionality without meddling 
with the fileHandler in log4j.

log4j2 also has a lot of new functionality and performance benefits (ref: 
HIVE-11304)

  was:
log4j2 has routing appender that would be super useful and probably only way to 
achieve " job specific log files" functionality without meddling around 
fileHandler in log4j.

Also log4j2 has lot of new functionalities and performance benefits


> upgrade log4j to log4j2
> ---
>
> Key: GOBBLIN-822
> URL: https://issues.apache.org/jira/browse/GOBBLIN-822
> Project: Apache Gobblin
>  Issue Type: Sub-task
>Affects Versions: 0.14.0
>Reporter: Jay Sen
>Priority: Major
> Fix For: 0.15.0
>
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> log4j2 has routing appender that would be super useful and probably only way 
> to achieve " job specific log files" functionality without meddling around 
> fileHandler in log4j.
> Also log4j2 has lot of new functionalities and performance benefits (ref: 
> HIVE-11304)



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Updated] (GOBBLIN-854) update config reader in standalone mode

2019-08-12 Thread Jay Sen (JIRA)


 [ 
https://issues.apache.org/jira/browse/GOBBLIN-854?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jay Sen updated GOBBLIN-854:

Description: 
Standalone mode {{SchedulerDaemon}} uses Java properties. This ticket is to use 
TypeSafe Config instead, to standardize config across the modes and to let 
config benefit from TypeSafe functionality.

It also takes 2 different config files as arguments, one as default and another 
as custom; we probably only need one property file.

  was:standalone mode {{SchedulerDaemon}} uses java properties, this ticket is 
to use TypeSafe Config instead to make config standardized across the modes and 
also enable config to take benefits of TypeSafe functionalities.


> update config reader in standalone mode
> ---
>
> Key: GOBBLIN-854
> URL: https://issues.apache.org/jira/browse/GOBBLIN-854
> Project: Apache Gobblin
>  Issue Type: Improvement
>Affects Versions: 0.14.0
>Reporter: Jay Sen
>Priority: Major
> Fix For: 0.15.0
>
>
> standalone mode {{SchedulerDaemon}} uses java properties, this ticket is to 
> use TypeSafe Config instead to make config standardized across the modes and 
> also enable config to take benefits of TypeSafe functionalities.
> Also it takes 2 different config file as argument, one as default and another 
> as custom, we probably only need one property file.





[jira] [Created] (GOBBLIN-854) update config reader in standalone mode

2019-08-12 Thread Jay Sen (JIRA)
Jay Sen created GOBBLIN-854:
---

 Summary: update config reader in standalone mode
 Key: GOBBLIN-854
 URL: https://issues.apache.org/jira/browse/GOBBLIN-854
 Project: Apache Gobblin
  Issue Type: Improvement
Reporter: Jay Sen


Standalone mode {{SchedulerDaemon}} uses Java properties. This ticket is to use 
TypeSafe Config instead, to standardize config across the modes and to let 
config benefit from TypeSafe functionality.





[jira] [Updated] (GOBBLIN-854) update config reader in standalone mode

2019-08-12 Thread Jay Sen (JIRA)


 [ 
https://issues.apache.org/jira/browse/GOBBLIN-854?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jay Sen updated GOBBLIN-854:

Affects Version/s: 0.14.0

> update config reader in standalone mode
> ---
>
> Key: GOBBLIN-854
> URL: https://issues.apache.org/jira/browse/GOBBLIN-854
> Project: Apache Gobblin
>  Issue Type: Improvement
>Affects Versions: 0.14.0
>Reporter: Jay Sen
>Priority: Major
>
> standalone mode {{SchedulerDaemon}} uses java properties, this ticket is to 
> use TypeSafe Config instead to make config standardized across the modes and 
> also enable config to take benefits of TypeSafe functionalities.





[jira] [Updated] (GOBBLIN-854) update config reader in standalone mode

2019-08-12 Thread Jay Sen (JIRA)


 [ 
https://issues.apache.org/jira/browse/GOBBLIN-854?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jay Sen updated GOBBLIN-854:

Fix Version/s: 0.15.0

> update config reader in standalone mode
> ---
>
> Key: GOBBLIN-854
> URL: https://issues.apache.org/jira/browse/GOBBLIN-854
> Project: Apache Gobblin
>  Issue Type: Improvement
>Affects Versions: 0.14.0
>Reporter: Jay Sen
>Priority: Major
> Fix For: 0.15.0
>
>
> standalone mode {{SchedulerDaemon}} uses java properties, this ticket is to 
> use TypeSafe Config instead to make config standardized across the modes and 
> also enable config to take benefits of TypeSafe functionalities.





[jira] [Commented] (GOBBLIN-824) upgrade to latest libraries in Gobblin

2019-08-12 Thread Jay Sen (JIRA)


[ 
https://issues.apache.org/jira/browse/GOBBLIN-824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16905621#comment-16905621
 ] 

Jay Sen commented on GOBBLIN-824:
-

GOBBLIN-818 does minor upgrades.

> upgrade to latest libraries in Gobblin
> --
>
> Key: GOBBLIN-824
> URL: https://issues.apache.org/jira/browse/GOBBLIN-824
> Project: Apache Gobblin
>  Issue Type: Improvement
>Affects Versions: 0.14.0
>Reporter: Jay Sen
>Priority: Major
> Fix For: 0.15.0
>
>
> lot of libs are old, like hadoop, hive, etc... 
>  It won't be easy to just compile gobblin with a new version by passing the 
> new version on the command line; there have been a lot of changes over the 
> last couple of years. 
>  Gobblin should use latest versions
> Hadoop: 2.9.x 
>  hive : 2.3.5
>  pegasus: 24.0.2
> Avro : 1.8.2
> etc...
> please feel free to mention which lib should be updated as part of this 
> overall upgrade process.





[jira] [Updated] (GOBBLIN-824) upgrade to latest libraries in Gobblin

2019-08-12 Thread Jay Sen (JIRA)


 [ 
https://issues.apache.org/jira/browse/GOBBLIN-824?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jay Sen updated GOBBLIN-824:

Summary: upgrade to latest libraries in Gobblin  (was: upgrade libs 
versions in Gobblin)

> upgrade to latest libraries in Gobblin
> --
>
> Key: GOBBLIN-824
> URL: https://issues.apache.org/jira/browse/GOBBLIN-824
> Project: Apache Gobblin
>  Issue Type: Improvement
>Affects Versions: 0.14.0
>Reporter: Jay Sen
>Priority: Major
> Fix For: 0.15.0
>
>
> lot of libs are old, like hadoop, hive, etc... 
>  It won't be easy to just compile gobblin with a new version by passing the 
> new version on the command line; there have been a lot of changes over the 
> last couple of years. 
>  Gobblin should use latest versions
> Hadoop: 2.9.x 
>  hive : 2.3.5
>  pegasus: 24.0.2
> Avro : 1.8.2
> etc...
> please feel free to mention which lib should be updated as part of this 
> overall upgrade process.





[jira] [Updated] (GOBBLIN-824) upgrade libs versions in Gobblin

2019-08-12 Thread Jay Sen (JIRA)


 [ 
https://issues.apache.org/jira/browse/GOBBLIN-824?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jay Sen updated GOBBLIN-824:

Description: 
A lot of libs are old, like hadoop, hive, etc. 
 It won't be easy to just compile gobblin with a new version by passing the new 
version on the command line; there have been a lot of changes over the last 
couple of years. 
 Gobblin should use the latest versions:

Hadoop: 2.9.x 
 hive : 2.3.5
 pegasus: 24.0.2

Avro : 1.8.2

etc...

Please feel free to mention which libs should be updated as part of this 
overall upgrade process.

  was:
lot of libs are old, like hadoop, hive, etc... 
It won't be easy to just compile gobblin with a new version by passing the new 
version on the command line; there have been a lot of changes over the last 
couple of years. 
Gobblin should use latest versions

hadoop : 2.7.7
hive : 2.3.5
pegasus: 24.0.2

etc...

please feel free to mention which lib should be updated as part of this overall 
upgrade process.



> upgrade libs versions in Gobblin
> 
>
> Key: GOBBLIN-824
> URL: https://issues.apache.org/jira/browse/GOBBLIN-824
> Project: Apache Gobblin
>  Issue Type: Improvement
>Affects Versions: 0.14.0
>Reporter: Jay Sen
>Priority: Major
> Fix For: 0.15.0
>
>
> lot of libs are old, like hadoop, hive, etc... 
>  It won't be easy to just compile gobblin with a new version by passing the 
> new version on the command line; there have been a lot of changes over the 
> last couple of years. 
>  Gobblin should use latest versions
> Hadoop: 2.9.x 
>  hive : 2.3.5
>  pegasus: 24.0.2
> Avro : 1.8.2
> etc...
> please feel free to mention which lib should be updated as part of this 
> overall upgrade process.





[jira] [Updated] (GOBBLIN-818) upgrade default hadoop versions to 2.7.x and hive version to 1.2

2019-08-12 Thread Jay Sen (JIRA)


 [ 
https://issues.apache.org/jira/browse/GOBBLIN-818?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jay Sen updated GOBBLIN-818:

Description: 
Gobblin uses the old Hive 1.0 and is not compatible with Hive 1.2 at compile 
time, due to backward-incompatible changes in Hive 1.2.

We should also update the Hadoop libs from 2.3 to at least 2.7.7, which is very 
stable.

  was:
Gobblin uses old hive 1.x.
Hive 2.x has significant changes and some incompatible/deprecated classes.
while hive 3.x is already in pipeline along with Hadoop 3.x, we should move to 
hive 2.x so user dont have to deal with manual fixes for their use of Gobblin. 

we should also update Hadoop libs from 2.3 to 2.7.7 at least which is very 
stable.



> upgrade default hadoop versions to 2.7.x and hive version to 1.2
> 
>
> Key: GOBBLIN-818
> URL: https://issues.apache.org/jira/browse/GOBBLIN-818
> Project: Apache Gobblin
>  Issue Type: Improvement
>Reporter: Jay Sen
>Priority: Major
>
> Gobblin uses old hive 1.0, compiling against Hive 1.2 is not compatible due 
> to backward incompatible changes in Hive 1.2
> we should also update Hadoop libs from 2.3 to 2.7.7 at least which is very 
> stable.





[jira] [Updated] (GOBBLIN-818) upgrade default hadoop versions to 2.7.x and hive version to 1.2

2019-08-12 Thread Jay Sen (JIRA)


 [ 
https://issues.apache.org/jira/browse/GOBBLIN-818?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jay Sen updated GOBBLIN-818:

Summary: upgrade default hadoop versions to 2.7.x and hive version to 1.2  
(was: MIgrate to Hive 2.x as default)

> upgrade default hadoop versions to 2.7.x and hive version to 1.2
> 
>
> Key: GOBBLIN-818
> URL: https://issues.apache.org/jira/browse/GOBBLIN-818
> Project: Apache Gobblin
>  Issue Type: Improvement
>Reporter: Jay Sen
>Priority: Major
>
> Gobblin uses old hive 1.x.
> Hive 2.x has significant changes and some incompatible/deprecated classes.
> while hive 3.x is already in pipeline along with Hadoop 3.x, we should move 
> to hive 2.x so user dont have to deal with manual fixes for their use of 
> Gobblin. 
> we should also update Hadoop libs from 2.3 to 2.7.7 at least which is very 
> stable.





[jira] [Created] (GOBBLIN-850) remove duplicated buildFileSystem method

2019-08-11 Thread Jay Sen (JIRA)
Jay Sen created GOBBLIN-850:
---

 Summary: remove duplicated buildFileSystem method
 Key: GOBBLIN-850
 URL: https://issues.apache.org/jira/browse/GOBBLIN-850
 Project: Apache Gobblin
  Issue Type: Improvement
Affects Versions: 0.14.0
Reporter: Jay Sen
 Fix For: 0.15.0


{{FileSystem buildFileSystem(Properties jobProps, Configuration 
configuration)}} is duplicated in almost all deployment modes; such functions 
should be part of gobblin-core:utils.

 





[jira] [Created] (GOBBLIN-845) upgrade to latest gradle

2019-08-04 Thread Jay Sen (JIRA)
Jay Sen created GOBBLIN-845:
---

 Summary: upgrade to latest gradle
 Key: GOBBLIN-845
 URL: https://issues.apache.org/jira/browse/GOBBLIN-845
 Project: Apache Gobblin
  Issue Type: Improvement
Reporter: Jay Sen


Currently Gobblin uses a very old gradle version (2.13); the current version is 
5.5.1.

A lot of new features have been added since then. 

A project of this complexity in particular would benefit from the latest 
features, especially around dependency management.

This does, however, require a lot of refactoring in the gradle scripts due to 
deprecated features.





[jira] [Updated] (GOBBLIN-845) upgrade to latest gradle

2019-08-04 Thread Jay Sen (JIRA)


 [ 
https://issues.apache.org/jira/browse/GOBBLIN-845?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jay Sen updated GOBBLIN-845:

Affects Version/s: 0.14.0

> upgrade to latest gradle
> 
>
> Key: GOBBLIN-845
> URL: https://issues.apache.org/jira/browse/GOBBLIN-845
> Project: Apache Gobblin
>  Issue Type: Improvement
>Affects Versions: 0.14.0
>Reporter: Jay Sen
>Priority: Major
>
> currently Gobblin uses very old gradle version (2.13), current version is 
> 5.5.1
> there are lot of new features added since then. 
> specially for project with this complexity will get help from latest 
> features, specially around dependency management.
> This, although, requires lot of changes in gradle script to refactor due to 
> deprecated features.





[jira] [Updated] (GOBBLIN-845) upgrade to latest gradle

2019-08-04 Thread Jay Sen (JIRA)


 [ 
https://issues.apache.org/jira/browse/GOBBLIN-845?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jay Sen updated GOBBLIN-845:

Fix Version/s: 0.15.0

> upgrade to latest gradle
> 
>
> Key: GOBBLIN-845
> URL: https://issues.apache.org/jira/browse/GOBBLIN-845
> Project: Apache Gobblin
>  Issue Type: Improvement
>Affects Versions: 0.14.0
>Reporter: Jay Sen
>Priority: Major
> Fix For: 0.15.0
>
>
> currently Gobblin uses very old gradle version (2.13), current version is 
> 5.5.1
> there are lot of new features added since then. 
> specially for project with this complexity will get help from latest 
> features, specially around dependency management.
> This, although, requires lot of changes in gradle script to refactor due to 
> deprecated features.





[jira] [Created] (GOBBLIN-844) centralize all graddle scripts

2019-08-04 Thread Jay Sen (JIRA)
Jay Sen created GOBBLIN-844:
---

 Summary: centralize all graddle scripts
 Key: GOBBLIN-844
 URL: https://issues.apache.org/jira/browse/GOBBLIN-844
 Project: Apache Gobblin
  Issue Type: Improvement
Affects Versions: 0.14.0
Reporter: Jay Sen
 Fix For: 0.15.0


Currently the flavored, environment, and some other gradle scripts are 
scattered; all of them should be under gradle dir scripts/, that is, all the 
scripts that facilitate the custom gradle build for gobblin.

The task here is pretty much to move the files and change the paths to those 
gradle scripts accordingly.





[jira] [Created] (GOBBLIN-843) Separately startable Admin UI & REST Server

2019-08-04 Thread Jay Sen (JIRA)
Jay Sen created GOBBLIN-843:
---

 Summary: Separately startable Admin UI & REST Server 
 Key: GOBBLIN-843
 URL: https://issues.apache.org/jira/browse/GOBBLIN-843
 Project: Apache Gobblin
  Issue Type: New Feature
Affects Versions: 0.14.0
Reporter: Jay Sen
 Fix For: 0.15.0


Currently, the admin UI & REST server are started within, and coupled to, the 
master process (standalone, or cluster master/worker process).

If we had the ability to start the REST server and admin UI separately, it 
would help manage them better and decouple the deployment modes from the 
admin/REST services.

The gobblin service command should facilitate starting and stopping the 
admin/REST services.





[jira] [Updated] (GOBBLIN-842) remove redundent TestAppender class

2019-08-01 Thread Jay Sen (JIRA)


 [ 
https://issues.apache.org/jira/browse/GOBBLIN-842?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jay Sen updated GOBBLIN-842:

Affects Version/s: 0.14.0
Fix Version/s: 0.15.0

> remove redundent TestAppender class
> ---
>
> Key: GOBBLIN-842
> URL: https://issues.apache.org/jira/browse/GOBBLIN-842
> Project: Apache Gobblin
>  Issue Type: New Feature
>Affects Versions: 0.14.0
>Reporter: Jay Sen
>Priority: Minor
> Fix For: 0.15.0
>
>
> for test case purpose TestAppender classes are created locally, it can be 
> common to remove redundent class 





[jira] [Commented] (GOBBLIN-842) remove redundent TestAppender class

2019-08-01 Thread Jay Sen (JIRA)


[ 
https://issues.apache.org/jira/browse/GOBBLIN-842?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16898361#comment-16898361
 ] 

Jay Sen commented on GOBBLIN-842:
-

The reference and context for this change can be found in GOBBLIN-822.

> remove redundent TestAppender class
> ---
>
> Key: GOBBLIN-842
> URL: https://issues.apache.org/jira/browse/GOBBLIN-842
> Project: Apache Gobblin
>  Issue Type: New Feature
>Affects Versions: 0.14.0
>Reporter: Jay Sen
>Priority: Minor
> Fix For: 0.15.0
>
>
> for test case purpose TestAppender classes are created locally, it can be 
> common to remove redundent class 





[jira] [Created] (GOBBLIN-842) remove redundent TestAppender class

2019-08-01 Thread Jay Sen (JIRA)
Jay Sen created GOBBLIN-842:
---

 Summary: remove redundent TestAppender class
 Key: GOBBLIN-842
 URL: https://issues.apache.org/jira/browse/GOBBLIN-842
 Project: Apache Gobblin
  Issue Type: New Feature
Reporter: Jay Sen


TestAppender classes are created locally for test cases; one can be made 
common to remove the redundant classes.





[jira] [Created] (GOBBLIN-835) add date transformation converter

2019-07-26 Thread Jay Sen (JIRA)
Jay Sen created GOBBLIN-835:
---

 Summary: add date transformation converter
 Key: GOBBLIN-835
 URL: https://issues.apache.org/jira/browse/GOBBLIN-835
 Project: Apache Gobblin
  Issue Type: New Feature
Affects Versions: 0.14.0
Reporter: Jay Sen
 Fix For: 0.15.0


Create a new converter for date formats.





[jira] [Created] (GOBBLIN-832) get specific tables instead of all tables from Hive DB

2019-07-21 Thread Jay Sen (JIRA)
Jay Sen created GOBBLIN-832:
---

 Summary: get specific tables instead of all tables from Hive DB
 Key: GOBBLIN-832
 URL: https://issues.apache.org/jira/browse/GOBBLIN-832
 Project: Apache Gobblin
  Issue Type: Improvement
Affects Versions: 0.14.0
Reporter: Jay Sen
 Fix For: 0.15.0


{{HiveDataSetFinder}} uses (Hive's) {{client.get().getAllDatabases()}} and 
{{client.get().getAllTables(db)}}, which can be inefficient when a DB has 
thousands of tables; fetching all tables every time puts unnecessary load on 
the metastore.
It should use methods like {{listTableNamesByFilter}} or {{getTables}}, which 
also provide ways to specify a table pattern.
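
A small sketch of the difference between listing every table and asking for a pattern. `TableSource` is a hypothetical stand-in that only mirrors the `getAllTables` and `getTables` calls named above, and `InMemoryTables` fakes a metastore so that the example runs without Hive. Treating `*` as the only wildcard is a simplification made for this sketch, not a statement about Hive's pattern syntax.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

/** Hypothetical stand-in for the two metastore calls discussed in the ticket. */
interface TableSource {
    List<String> getAllTables(String db);

    List<String> getTables(String db, String tablePattern);
}

/** In-memory fake so the sketch runs without a Hive metastore. */
final class InMemoryTables implements TableSource {
    private final List<String> tables;

    InMemoryTables(String... names) {
        this.tables = Arrays.asList(names);
    }

    @Override
    public List<String> getAllTables(String db) {
        // Returns everything; the caller would have to filter on its side.
        return new ArrayList<>(tables);
    }

    @Override
    public List<String> getTables(String db, String tablePattern) {
        // Assumption for this sketch: '*' matches any run of characters,
        // every other character is matched literally.
        String[] parts = tablePattern.split("\\*", -1);
        StringBuilder regex = new StringBuilder();
        for (int i = 0; i < parts.length; i++) {
            if (i > 0) {
                regex.append(".*");
            }
            regex.append(Pattern.quote(parts[i]));
        }
        Pattern compiled = Pattern.compile(regex.toString());
        List<String> matched = new ArrayList<>();
        for (String table : tables) {
            if (compiled.matcher(table).matches()) {
                matched.add(table);
            }
        }
        return matched;
    }
}
```

With tables `events_2019`, `events_2020` and `users`, `getTables("db", "events_*")` returns only the two `events_` tables, whereas `getAllTables("db")` returns all three and leaves the filtering to the caller.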





[jira] [Created] (GOBBLIN-830) update config key to define job launcher type

2019-07-16 Thread Jay Sen (JIRA)
Jay Sen created GOBBLIN-830:
---

 Summary: update config key to define job launcher type
 Key: GOBBLIN-830
 URL: https://issues.apache.org/jira/browse/GOBBLIN-830
 Project: Apache Gobblin
  Issue Type: Improvement
Reporter: Jay Sen


config key JOB_LAUNCHER_TYPE_KEY = "launcher.type" should be set to 
"job.launcher.type"





[jira] [Created] (GOBBLIN-824) upgrade libs versions in Gobblin

2019-07-11 Thread Jay Sen (JIRA)
Jay Sen created GOBBLIN-824:
---

 Summary: upgrade libs versions in Gobblin
 Key: GOBBLIN-824
 URL: https://issues.apache.org/jira/browse/GOBBLIN-824
 Project: Apache Gobblin
  Issue Type: Improvement
Affects Versions: 0.14.0
Reporter: Jay Sen
 Fix For: 0.15.0


A lot of libs are old, like hadoop, hive, etc. 
It won't be easy to just compile gobblin with a new version by passing the new 
version on the command line; there have been a lot of changes over the last 
couple of years. 
Gobblin should use the latest versions:

hadoop : 2.7.7
hive : 2.3.5
pegasus: 24.0.2

etc...

Please feel free to mention which libs should be updated as part of this 
overall upgrade process.






[jira] [Updated] (GOBBLIN-822) upgrade log4j to log4j2

2019-07-11 Thread Jay Sen (JIRA)


 [ 
https://issues.apache.org/jira/browse/GOBBLIN-822?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jay Sen updated GOBBLIN-822:

Summary: upgrade log4j to log4j2  (was: update log4j to log4j2)

> upgrade log4j to log4j2
> ---
>
> Key: GOBBLIN-822
> URL: https://issues.apache.org/jira/browse/GOBBLIN-822
> Project: Apache Gobblin
>  Issue Type: Sub-task
>Affects Versions: 0.14.0
>Reporter: Jay Sen
>Priority: Major
> Fix For: 0.15.0
>
>
> log4j2 has routing appender that would be super useful and probably only way 
> to achieve " job specific log files" functionality without meddling around 
> fileHandler in log4j.
> Also log4j2 has lot of new functionalities and performance benefits





[jira] [Updated] (GOBBLIN-822) update log4j to log4j2

2019-07-11 Thread Jay Sen (JIRA)


 [ 
https://issues.apache.org/jira/browse/GOBBLIN-822?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jay Sen updated GOBBLIN-822:

Description: 
log4j2 has routing appender that would be super useful and probably only way to 
achieve " job specific log files" functionality without meddling around 
fileHandler in log4j.

Also log4j2 has lot of new functionalities and performance benefits

  was:
log4j2 has routing appender that would be super useful and probably only way to 
achieve this functionality without meddling around fileHandler in log4j.

Also log4j2 has lot of new functionalities and performance benefits


> update log4j to log4j2
> --
>
> Key: GOBBLIN-822
> URL: https://issues.apache.org/jira/browse/GOBBLIN-822
> Project: Apache Gobblin
>  Issue Type: Sub-task
>Affects Versions: 0.14.0
>Reporter: Jay Sen
>Priority: Major
> Fix For: 0.15.0
>
>
> log4j2 has routing appender that would be super useful and probably only way 
> to achieve " job specific log files" functionality without meddling around 
> fileHandler in log4j.
> Also log4j2 has lot of new functionalities and performance benefits





[jira] [Created] (GOBBLIN-823) upgrade or remove lombok usage

2019-07-11 Thread Jay Sen (JIRA)
Jay Sen created GOBBLIN-823:
---

 Summary: upgrade or remove lombok usage
 Key: GOBBLIN-823
 URL: https://issues.apache.org/jira/browse/GOBBLIN-823
 Project: Apache Gobblin
  Issue Type: Improvement
Affects Versions: 0.14.0
Reporter: Jay Sen
 Fix For: 0.15.0


Lombok is useful once a developer gets used to it, but it has its own learning 
curve for new users. Given that lombok adds unnecessary complexity and most of 
what it does can be taken care of by a smart IDE anyway, I suggest we remove 
the use of lombok.
If not, let's at least upgrade to the latest version to get all the bug fixes.





[jira] [Updated] (GOBBLIN-788) job specific log files

2019-07-11 Thread Jay Sen (JIRA)


 [ 
https://issues.apache.org/jira/browse/GOBBLIN-788?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jay Sen updated GOBBLIN-788:

Affects Version/s: 0.14.0

> job specific log files
> --
>
> Key: GOBBLIN-788
> URL: https://issues.apache.org/jira/browse/GOBBLIN-788
> Project: Apache Gobblin
>  Issue Type: New Feature
>Affects Versions: 0.14.0
>Reporter: Jay Sen
>Priority: Major
>
> Each job or task running on individual node should create separate log file 
> (with some standard  file naming convention) which has all the logging 
> specific to that job or task only. The primary platform specific logging can 
> still go to {{logs/.[out|err]}} 
> This would benefit in easy maintenance and operations and debugging.
>  





[jira] [Updated] (GOBBLIN-788) job specific log files

2019-07-11 Thread Jay Sen (JIRA)


 [ 
https://issues.apache.org/jira/browse/GOBBLIN-788?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jay Sen updated GOBBLIN-788:

Fix Version/s: 0.15.0

> job specific log files
> --
>
> Key: GOBBLIN-788
> URL: https://issues.apache.org/jira/browse/GOBBLIN-788
> Project: Apache Gobblin
>  Issue Type: New Feature
>Affects Versions: 0.14.0
>Reporter: Jay Sen
>Priority: Major
> Fix For: 0.15.0
>
>
> Each job or task running on individual node should create separate log file 
> (with some standard  file naming convention) which has all the logging 
> specific to that job or task only. The primary platform specific logging can 
> still go to {{logs/.[out|err]}} 
> This would benefit in easy maintenance and operations and debugging.
>  



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Created] (GOBBLIN-822) update log4j to log4j2

2019-07-11 Thread Jay Sen (JIRA)
Jay Sen created GOBBLIN-822:
---

 Summary: update log4j to log4j2
 Key: GOBBLIN-822
 URL: https://issues.apache.org/jira/browse/GOBBLIN-822
 Project: Apache Gobblin
  Issue Type: Sub-task
Affects Versions: 0.14.0
Reporter: Jay Sen
 Fix For: 0.15.0


log4j2 has a routing appender that would be very useful and is probably the only way to 
achieve this functionality without meddling with the file handler in log4j.

log4j2 also has many new features and performance benefits.





[jira] [Created] (GOBBLIN-818) Migrate to Hive 2.x as default

2019-07-03 Thread Jay Sen (JIRA)
Jay Sen created GOBBLIN-818:
---

 Summary: Migrate to Hive 2.x as default
 Key: GOBBLIN-818
 URL: https://issues.apache.org/jira/browse/GOBBLIN-818
 Project: Apache Gobblin
  Issue Type: Improvement
Reporter: Jay Sen


Gobblin uses the old Hive 1.x.
Hive 2.x has significant changes and some incompatible or deprecated classes.
While Hive 3.x is already in the pipeline along with Hadoop 3.x, we should move to 
Hive 2.x so users don't have to apply manual fixes to use Gobblin. 

We should also update the Hadoop libs from 2.3 to at least 2.7.7, which is very 
stable.




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (GOBBLIN-707) combine & standardize all gobblin scripts into one master script & restructure configs accordingly

2019-07-03 Thread Jay Sen (JIRA)


[ 
https://issues.apache.org/jira/browse/GOBBLIN-707?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16878203#comment-16878203
 ] 

Jay Sen commented on GOBBLIN-707:
-

[~ibuenros], please take a look.

> combine & standardize all gobblin scripts into one master script & 
> restructure configs accordingly
> --
>
> Key: GOBBLIN-707
> URL: https://issues.apache.org/jira/browse/GOBBLIN-707
> Project: Apache Gobblin
>  Issue Type: Improvement
>Reporter: Jay Sen
>Priority: Major
>  Time Spent: 9h
>  Remaining Estimate: 0h
>
> Gobblin supports multiple execution modes (CLI, Standalone, 
> cluster-master, cluster-worker, AWS, YARN, MR) and various command-line 
> utilities to run CLI and admin commands. There is an individual script for each 
> of them.
> Having individual scripts introduces a lot of issues:
>  # the scripts handle Gobblin variables and user parameters differently, and 
> the handling is highly inconsistent across scripts.
>  # functionality around start, stop, status checking, and PID handling, among 
> many other things, varies widely with the implementation of each script.
>  # features like GC & JVM params, log4j file selection, classpath 
> calculation, etc. exist in some Gobblin scripts but not all, adding to an 
> inconsistent user experience.
>  # maintaining a total of 13 scripts would be too much effort.
> Also, all the Gobblin scripts share a lot of common code to handle params, 
> start and stop services, status checks, PID handling, etc. Combining all the 
> scripts into one not only makes maintenance easier but also brings clarity and 
> consistency.
>  
> Solution:
> 1. There can be one gobblin.sh script to handle all Gobblin commands and 
> deployment options, per the following signature. NOTE: This
> {{gobblin.sh   }}
>  {{gobblin.sh   }}
> {{commands values: admin, cli, statestore-check, statestore-clean, 
> historystore-manager, classpath}}
>  {{service values: standalone, cluster-master, cluster-worker, aws, yarn, mr, 
> service}}
> With the above change, the following become valid commands.
> {code:java}
> # all under GobblinCli class
> gobblin run listQuickApps   -> gobblin cli run listQuickApps
> gobblin run listQuickApps   -> gobblin cli run listQuickApps
> gobblin run                 -> gobblin cli run 
> # class: JobStateToJsonConverter
> statestore-checker.sh       -> gobblin statestore-checker 
> # class: StateStoreCleaner
> statestore-clean.sh         -> gobblin statestore-clean 
> # class: DatabaseJobHistoryStoreSchemaManager
> historystore-manager.sh     -> gobblin historystore-manager 
> # class: Cli
> gobblin-admin.sh            -> gobblin admin 
> # all gobblin deployment modes
> gobblin-cluster-master.sh   -> gobblin cluster-master start|stop|status
> gobblin-cluster-worker.sh   -> gobblin cluster-worker start|stop|status
> gobblin-compaction.sh       -> gobblin cluster-master start|stop|status
> gobblin-env.sh              -> gobblin cluster-master start|stop|status
> gobblin-mapreduce.sh        -> gobblin mr start|stop|status
> gobblin-service.sh          -> gobblin service start|stop|status
> gobblin-standalone.sh       -> gobblin standalone start|stop|status
> gobblin-yarn.sh             -> gobblin yarn start|stop|status
> {code}
>  
> 2. Configs also need to be structured and deduplicated accordingly, to make it 
> clear which config will be picked up for which execution mode.
>  {color:#ff}
>  NOTE: this refactoring adds all cli and service commands to gobblin.sh and 
> hence changes the syntax for all commands and services.{color}
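
The dispatch described in the issue can be sketched roughly as follows. This is a minimal illustration in Java rather than shell, not the actual gobblin.sh logic: the class name `GobblinCommandRouter`, the `route` method, and its return strings are hypothetical. Only the command values, service values, and the start|stop|status actions are taken from the description above.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch of how a single entry point could route its first argument. */
final class GobblinCommandRouter {
    // Command and service values copied from the issue description.
    static final List<String> COMMANDS = Arrays.asList(
        "admin", "cli", "statestore-check", "statestore-clean",
        "historystore-manager", "classpath");
    static final List<String> SERVICES = Arrays.asList(
        "standalone", "cluster-master", "cluster-worker", "aws", "yarn", "mr", "service");
    static final List<String> ACTIONS = Arrays.asList("start", "stop", "status");

    /** Returns a description of what would be run; throws on invalid input. */
    static String route(String... args) {
        if (args.length == 0) {
            throw new IllegalArgumentException("expected a command or a service name");
        }
        String name = args[0];
        if (COMMANDS.contains(name)) {
            // Commands pass any remaining arguments through untouched.
            String rest = String.join(" ", Arrays.copyOfRange(args, 1, args.length));
            return ("command " + name + " " + rest).trim();
        }
        if (SERVICES.contains(name)) {
            // Services accept exactly one action: start, stop, or status.
            if (args.length != 2 || !ACTIONS.contains(args[1])) {
                throw new IllegalArgumentException(name + " expects start|stop|status");
            }
            return "service " + name + " " + args[1];
        }
        throw new IllegalArgumentException("unknown command or service: " + name);
    }
}
```

With this sketch, `route("cli", "run", "listQuickApps")` is treated as a command, `route("yarn", "start")` as a service action, and anything else is rejected in one place instead of in 13 separate scripts.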





[jira] [Updated] (GOBBLIN-788) job specific log files

2019-06-28 Thread Jay Sen (JIRA)


 [ 
https://issues.apache.org/jira/browse/GOBBLIN-788?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jay Sen updated GOBBLIN-788:

Description: 
Each job or task running on an individual node should create a separate log file 
(with a standard file naming convention) containing only the logging specific 
to that job or task. The primary platform-specific logging can still go to 
{{logs/.[out|err]}}. 

This would make maintenance, operations, and debugging easier.

 

  was:
Each job or task running on an individual node should create a separate log file 
(with a standard convention) containing only the logging specific to that job 
or task. The primary platform-specific logging can still go to 
{{logs/.[out|err]}}. 

This would be beneficial in a lot of use cases.

 


> job specific log files
> --
>
> Key: GOBBLIN-788
> URL: https://issues.apache.org/jira/browse/GOBBLIN-788
> Project: Apache Gobblin
>  Issue Type: New Feature
>Reporter: Jay Sen
>Priority: Major
>
> Each job or task running on an individual node should create a separate log file 
> (with a standard file naming convention) containing only the logging 
> specific to that job or task. The primary platform-specific logging can 
> still go to {{logs/.[out|err]}}. 
> This would make maintenance, operations, and debugging easier.
>  





[jira] [Updated] (GOBBLIN-811) non gobblin job files should not throw exception

2019-06-28 Thread Jay Sen (JIRA)


 [ 
https://issues.apache.org/jira/browse/GOBBLIN-811?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jay Sen updated GOBBLIN-811:

Description: 
Any files under the job directory other than .pull or .conf files should be ignored 
with a DEBUG or WARN message instead of throwing an exception.

for example: 

{{2019-06-22 21:21:05 PDT ERROR [newDaemonThreadFactory] org.apache.gobblin.util.filesystem.ExceptionCatchingPathAlterationListenerDecorator - onFileCreate failure:
java.lang.RuntimeException: Cannot load pull file file:/tools/gobblin-dist/gobblin-cluster-data/jobs/test_job.pull_old due to unrecognized extension.
 at org.apache.gobblin.runtime.job_catalog.FSPathAlterationListenerAdaptor.onFileCreate(FSPathAlterationListenerAdaptor.java:61)
 at org.apache.gobblin.util.filesystem.ExceptionCatchingPathAlterationListenerDecorator.onFileCreate(ExceptionCatchingPathAlterationListenerDecorator.java:55)
 at org.apache.gobblin.util.filesystem.PathAlterationObserver.doCreate(PathAlterationObserver.java:276)
 at org.apache.gobblin.util.filesystem.PathAlterationObserver.checkAndNotify(PathAlterationObserver.java:210)
 at org.apache.gobblin.util.filesystem.PathAlterationObserver.checkAndNotify(PathAlterationObserver.java:180)
 at org.apache.gobblin.util.filesystem.PathAlterationObserverScheduler.run(PathAlterationObserverScheduler.java:163)
 at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
 at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
 at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
 at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
 at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
 at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
 at java.lang.Thread.run(Thread.java:748)}}

  was:
Any files under the job directory other than .pull or .conf files should be ignored 
with a DEBUG or WARN message instead of throwing an exception.

for example: 
{{2019-06-22 21:21:05 PDT ERROR [newDaemonThreadFactory] org.apache.gobblin.util.filesystem.ExceptionCatchingPathAlterationListenerDecorator - onFileCreate failure:
java.lang.RuntimeException: Cannot load pull file file:/tools/gobblin-dist/gobblin-cluster-data/jobs/test_job.pull_old due to unrecognized extension.
 at org.apache.gobblin.runtime.job_catalog.FSPathAlterationListenerAdaptor.onFileCreate(FSPathAlterationListenerAdaptor.java:61)
 at org.apache.gobblin.util.filesystem.ExceptionCatchingPathAlterationListenerDecorator.onFileCreate(ExceptionCatchingPathAlterationListenerDecorator.java:55)
 at org.apache.gobblin.util.filesystem.PathAlterationObserver.doCreate(PathAlterationObserver.java:276)
 at org.apache.gobblin.util.filesystem.PathAlterationObserver.checkAndNotify(PathAlterationObserver.java:210)
 at org.apache.gobblin.util.filesystem.PathAlterationObserver.checkAndNotify(PathAlterationObserver.java:180)
 at org.apache.gobblin.util.filesystem.PathAlterationObserverScheduler.run(PathAlterationObserverScheduler.java:163)
 at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
 at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
 at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
 at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
 at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
 at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
 at java.lang.Thread.run(Thread.java:748)}}
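
The requested behavior might look something like the sketch below. It is not the actual `FSPathAlterationListenerAdaptor` code: the class `JobFileFilter` and its methods are hypothetical, and `java.util.logging` stands in for whatever logger Gobblin uses. Only the two accepted extensions (.pull and .conf) and the "warn instead of throw" idea come from the issue.

```java
import java.util.Arrays;
import java.util.List;
import java.util.logging.Logger;

/** Hypothetical sketch: skip unrecognized files in the job dir instead of throwing. */
final class JobFileFilter {
    private static final Logger LOG = Logger.getLogger(JobFileFilter.class.getName());

    // The two extensions the issue names as valid job files.
    static final List<String> JOB_EXTENSIONS = Arrays.asList(".pull", ".conf");

    /** True if the file name ends with one of the recognized job-file extensions. */
    static boolean isJobFile(String fileName) {
        for (String ext : JOB_EXTENSIONS) {
            if (fileName.endsWith(ext)) {
                return true;
            }
        }
        return false;
    }

    /** Returns true if the file would be loaded, false if it was skipped with a warning. */
    static boolean onFileCreate(String path) {
        if (!isJobFile(path)) {
            // Log and move on rather than raising a RuntimeException.
            LOG.warning("Ignoring file with unrecognized extension: " + path);
            return false;
        }
        // Loading of the job spec would happen here in a real listener.
        return true;
    }
}
```

Under this sketch, the `test_job.pull_old` file from the log above would produce a single warning line and no stack trace.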


> non gobblin job files should not throw exception
> 
>
> Key: GOBBLIN-811
> URL: https://issues.apache.org/jira/browse/GOBBLIN-811
> Project: Apache Gobblin
>  Issue Type: Improvement
>Reporter: Jay Sen
>Priority: Minor
>
> Any files under the job directory other than .pull or .conf files should be ignored 
> with a DEBUG or WARN message instead of throwing an exception.
> for example: 
> {{2019-06-22 21:21:05 PDT ERROR [newDaemonThreadFactory] 
> org.apache.gobblin.util.filesystem.ExceptionCatchingPathAlterationListenerDecorator
>  - onFileCreate failure:
> java.lang.RuntimeException: Cannot load pull file 
> file:/tools/gobblin-dist/gobblin-cluster-data/jobs/test_job.pull_old due to 
> unrecognized extension.
>  at 
> org.apache.gobblin.runtime.job_catalog.FSPathAlterationListenerAdaptor.onFileCreate(FSPathAlterationListenerAdaptor.java:61)
>  at 
> org.apache.gobblin.util.filesystem.ExceptionCatchingPathAlterationListenerDecorator.onFileCreate(ExceptionCatchingPathAlterationListenerDecorator.java:55)
>  at 
> 

[jira] [Commented] (GOBBLIN-613) Use the Hadoop tokens provided by WormholePushJob, instead of negotiating them via Gobblin

2019-06-28 Thread Jay Sen (JIRA)


[ 
https://issues.apache.org/jira/browse/GOBBLIN-613?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16875162#comment-16875162
 ] 

Jay Sen commented on GOBBLIN-613:
-

[~rakeshmalladi2018], can you provide some more detail here? Thanks

> Use the Hadoop tokens provided by WormholePushJob, instead of negotiating 
> them via Gobblin
> --
>
> Key: GOBBLIN-613
> URL: https://issues.apache.org/jira/browse/GOBBLIN-613
> Project: Apache Gobblin
>  Issue Type: New Feature
>Reporter: Rakesh Malladi
>Priority: Major
>






[jira] [Created] (GOBBLIN-812) take worker id from command line if specified

2019-06-24 Thread Jay Sen (JIRA)
Jay Sen created GOBBLIN-812:
---

 Summary: take worker id from command line if specified
 Key: GOBBLIN-812
 URL: https://issues.apache.org/jira/browse/GOBBLIN-812
 Project: Apache Gobblin
  Issue Type: New Feature
  Components: gobblin-cluster
Affects Versions: 0.14.0
Reporter: Jay Sen
Assignee: Hung Tran
 Fix For: 0.15.0


{{GobblinTaskRunner}} can take the worker ID from the command line, if specified, 
and use it when starting the worker.
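
One way to picture this is the sketch below. It is an assumption-laden illustration, not `GobblinTaskRunner`'s real argument handling: the class `WorkerIdResolver`, the flag name `--worker_id`, and the generated fallback ID are all hypothetical.

```java
import java.util.UUID;

/** Hypothetical sketch: prefer a worker ID given on the command line. */
final class WorkerIdResolver {
    // Assumed flag name for illustration only; the issue does not specify one.
    static final String WORKER_ID_FLAG = "--worker_id";

    /** Returns the ID following the flag if present, otherwise a generated one. */
    static String resolve(String[] args) {
        for (int i = 0; i < args.length - 1; i++) {
            if (WORKER_ID_FLAG.equals(args[i]) && !args[i + 1].isEmpty()) {
                return args[i + 1];
            }
        }
        // Fallback when no ID is given (assumption: a generated ID is acceptable).
        return "worker-" + UUID.randomUUID();
    }
}
```

Calling `resolve(new String[] {"--worker_id", "w1"})` yields the supplied ID, while an empty argument list falls back to a generated one.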




