Zhixiong Chen created GOBBLIN-716:
-------------------------------------

             Summary: Add FileBasedSource lineage event
                 Key: GOBBLIN-716
                 URL: https://issues.apache.org/jira/browse/GOBBLIN-716
             Project: Apache Gobblin
          Issue Type: Bug
            Reporter: Zhixiong Chen
            Assignee: Zhixiong Chen


It'd be useful to support configuration properties to override the default 
username when connecting to an HDFS cluster, e.g. in the HDFS writers.  By 
default, the system username that owns the Gobblin process is used.

One particular use case for this is for stand-alone Gobblin instances running 
as the `root` system user within a Docker container.  An organization that 
shares a stand-alone Gobblin cluster across multiple teams for data 
integration may have multiple users submitting jobs meant to touch different 
parts of the HDFS namespace, each under the control of a separate user.

Note that this feature is not a security mechanism: any job configuration 
file could still specify any username, so there would be no enforced 
privilege boundaries either way.

One approach that does not appear to work is to specify the `hadoop.job.ugi` 
property in a job configuration file, despite what the following code appears 
to suggest in 
[FsDataWriter.java](https://github.com/linkedin/gobblin/blob/7141ec88c255c8c3cbc7054fb8146eebe77fc09d/gobblin-core/src/main/java/gobblin/writer/FsDataWriter.java#L88-L91):

```java
    Configuration conf = new Configuration();
    // Add all job configuration properties so they are picked up by Hadoop
    JobConfigurationUtils.putStateIntoConfiguration(properties, conf);
    this.fs = WriterUtils.getWriterFS(properties, this.numBranches, this.branchId);
```
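
For illustration, here is a minimal sketch of the lookup the requested feature 
might perform: read a username override from the job properties and fall back 
to the user that owns the process. The class `HdfsUserResolver` and the 
property name `writer.hdfs.user.name` are hypothetical and do not exist in 
Gobblin; they are only assumptions for this example.

```java
import java.util.Properties;

/** Hypothetical helper, not part of Gobblin: picks the username for HDFS access. */
final class HdfsUserResolver {
    // Hypothetical property name, chosen for illustration only.
    static final String USER_KEY = "writer.hdfs.user.name";

    private HdfsUserResolver() {}

    /**
     * Returns the configured override if it is present and non-blank;
     * otherwise returns the system user that owns the JVM process.
     */
    static String resolveUser(Properties jobProps) {
        String configured = jobProps.getProperty(USER_KEY);
        if (configured != null && !configured.trim().isEmpty()) {
            return configured.trim();
        }
        return System.getProperty("user.name");
    }
}
```

The resolved name could then presumably be handed to Hadoop's 
`FileSystem.get(URI, Configuration, String user)` overload when the writer 
obtains its file system, which on a cluster using simple authentication acts 
as the named user. Whether that fits into `WriterUtils.getWriterFS` is an open 
design question.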
 
*Github Url* : https://github.com/linkedin/gobblin/issues/1904 
*Github Reporter* : *mgomezch* 
*Github Created At* : 2017-05-26T18:58:16Z 
*Github Updated At* : 2017-05-26T18:58:16Z



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
