[ 
https://issues.apache.org/jira/browse/GOBBLIN-716?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhixiong Chen updated GOBBLIN-716:
----------------------------------
    External issue URL:   (was: https://github.com/linkedin/gobblin/issues/1904)

> Add FileBasedSource lineage event
> ---------------------------------
>
>                 Key: GOBBLIN-716
>                 URL: https://issues.apache.org/jira/browse/GOBBLIN-716
>             Project: Apache Gobblin
>          Issue Type: Bug
>            Reporter: Zhixiong Chen
>            Assignee: Zhixiong Chen
>            Priority: Major
>
> It'd be useful to support configuration properties that override the default 
> username used when connecting to an HDFS cluster, e.g. in the HDFS writers.  By 
> default, the system user that owns the Gobblin process is used.
> One particular use case is stand-alone Gobblin instances running as the `root` 
> system user within a Docker container.  An organization that uses a 
> stand-alone Gobblin cluster for data integration across multiple teams may 
> have multiple users submitting jobs meant to touch different parts of the HDFS 
> namespace, each under the control of a separate user.
> Note that this feature is not a security mechanism: any job configuration file 
> could still specify any username, so no privilege boundaries would be enforced.
> One solution that does not appear to work is to specify the `hadoop.job.ugi` 
> property in a job configuration file, despite what the following code in 
> [FsDataWriter.java](https://github.com/linkedin/gobblin/blob/7141ec88c255c8c3cbc7054fb8146eebe77fc09d/gobblin-core/src/main/java/gobblin/writer/FsDataWriter.java#L88-L91) appears to suggest:
> ```java
>     Configuration conf = new Configuration();
>     // Add all job configuration properties so they are picked up by Hadoop
>     JobConfigurationUtils.putStateIntoConfiguration(properties, conf);
>     this.fs = WriterUtils.getWriterFS(properties, this.numBranches, this.branchId);
> ```
>  
> *Github Url* : https://github.com/linkedin/gobblin/issues/1904 
> *Github Reporter* : *mgomezch* 
> *Github Created At* : 2017-05-26T18:58:16Z 
> *Github Updated At* : 2017-05-26T18:58:16Z
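
The override requested in the description could be sketched roughly as follows. This is a minimal illustration rather than Gobblin code: the class `HdfsUserResolver` and the property key `writer.fs.hdfs.user` are hypothetical names invented for this example. It only shows the fallback rule the description states, i.e. use the configured username when present, otherwise the system user that owns the process.

```java
import java.util.Properties;

// Hypothetical helper for illustration only; it is not part of Gobblin.
final class HdfsUserResolver {
    // Hypothetical property key; Gobblin does not define it.
    static final String USER_KEY = "writer.fs.hdfs.user";

    private HdfsUserResolver() {}

    /**
     * Returns the username set in the job properties, or the system user
     * that owns the current JVM process when the key is unset or blank.
     */
    static String resolveUser(Properties jobProps) {
        String configured = jobProps.getProperty(USER_KEY);
        if (configured != null && !configured.trim().isEmpty()) {
            return configured.trim();
        }
        // Default behavior described in the issue: the process owner.
        return System.getProperty("user.name");
    }
}
```

The resolved name could then be handed to Hadoop, for example through the `FileSystem.get(URI, Configuration, String)` overload or `UserGroupInformation.createRemoteUser(String)` combined with `doAs`. Whether that fits the way `WriterUtils.getWriterFS` builds the file system is an open question that this sketch does not answer.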



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
