Re: Review Request 59394: Race condition: webhdfs call mkdir /tmp/druid-indexing before /tmp making tmp not writable.

2017-06-06 Thread Dmytro Grinenko

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/59394/#review177018
---


Ship It!

- Dmytro Grinenko


On May 19, 2017, 9:54 a.m., Andrew Onischuk wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/59394/
> ---
> 
> (Updated May 19, 2017, 9:54 a.m.)
> 
> 
> Review request for Ambari and Vitalyi Brodetskyi.
> 
> 
> Bugs: AMBARI-21070
> https://issues.apache.org/jira/browse/AMBARI-21070
> 
> 
> Repository: ambari
> 
> 
> Description
> ---
> 
> Race condition: a webhdfs call runs mkdir on /tmp/druid-indexing before /tmp
> is created, leaving /tmp not writable.
> 
> During an HDP install through Ambari, at the "start components on host< >"
> step, webhdfs operations run in the background to create the HDFS directory
> structures required by specific components (/tmp, /tmp/hive, /user/druid,
> /tmp/druid-indexing ...).
> 
> The expected order is getfileInfo: /tmp --> mkdir: /tmp --> changePermission:
> /tmp to 777 (hdfs:hdfs), so that /tmp is accessible to all and hivemetastore
> is able to create /tmp/hive (the Hive scratch directory).
> 
> In a Druid install, however, mkdir of /tmp/druid-indexing is most of the time
> called before the actual /tmp creation, so /tmp is created implicitly as a
> parent directory with just the default directory permission (755).
> 
> -> The next call of getfileInfo: /tmp then reports that /tmp already exists,
> so it is neither created nor has its permission changed.
> 
> This leaves /tmp not writable, so the HiveServer process shuts down because it
> is unable to create or access /tmp/hive.
> 
> hdfs-audit log:
> 
> 
> 
> 
> 2017-05-12 06:39:51,067 INFO FSNamesystem.audit: allowed=true ugi=hdfs (auth:SIMPLE) ip=/172.27.26.3 cmd=getfileinfo src=/tmp/druid-indexing dst=null perm=null proto=webhdfs
> 2017-05-12 06:39:51,120 INFO FSNamesystem.audit: allowed=true ugi=hdfs (auth:SIMPLE) ip=/172.27.22.81 cmd=contentSummary src=/user/druid dst=null perm=null proto=webhdfs
> 2017-05-12 06:39:51,133 INFO FSNamesystem.audit: allowed=true ugi=hdfs (auth:SIMPLE) ip=/172.27.37.200 cmd=setPermission src=/ats/active dst=null perm=hdfs:hadoop:rwxr-xr-x proto=webhdfs
> 2017-05-12 06:39:51,155 INFO FSNamesystem.audit: allowed=true ugi=hdfs (auth:SIMPLE) ip=/172.27.26.3 cmd=mkdirs src=/tmp/druid-indexing dst=null perm=hdfs:hdfs:rwxr-xr-x proto=webhdfs
> 2017-05-12 06:39:51,206 INFO FSNamesystem.audit: allowed=true ugi=hdfs (auth:SIMPLE) ip=/172.27.22.81 cmd=listStatus src=/user/druid dst=null perm=null proto=webhdfs
> 2017-05-12 06:39:51,235 INFO FSNamesystem.audit: allowed=true ugi=hdfs (auth:SIMPLE) ip=/172.27.37.200 cmd=setPermission src=/ats/ dst=null perm=yarn:hadoop:rwxr-xr-x proto=webhdfs
> 2017-05-12 06:39:51,249 INFO FSNamesystem.audit: allowed=true ugi=hdfs (auth:SIMPLE) ip=/172.27.26.3 cmd=setPermission src=/tmp/druid-indexing dst=null perm=hdfs:hdfs:rwxr-xr-x proto=webhdfs
> 2017-05-12 06:39:51,290 INFO FSNamesystem.audit: allowed=true ugi=hdfs (auth:SIMPLE) ip=/172.27.22.81 cmd=listStatus src=/user/druid/data dst=null perm=null proto=webhdfs
> 2017-05-12 06:39:51,339 INFO FSNamesystem.audit: allowed=true ugi=hdfs (auth:SIMPLE) ip=/172.27.37.200 cmd=setPermission src=/ats/active/ dst=null perm=hdfs:hadoop:rwxr-xr-x proto=webhdfs
> 2017-05-12 06:39:51,341 INFO FSNamesystem.audit: allowed=true ugi=hdfs (auth:SIMPLE) ip=/172.27.26.3 cmd=setOwner src=/tmp/druid-indexing dst=null perm=druid:hdfs:rwxr-xr-x proto=webhdfs
> 2017-05-12 06:39:51,380 INFO FSNamesystem.audit: allowed=true ugi=hdfs (auth:SIMPLE) ip=/172.27.22.81 cmd=setOwner src=/user/druid/data dst=null perm=druid:hdfs:rwxr-xr-x proto=webhdfs
> 2017-05-12 06:39:51,431 INFO FSNamesystem.audit: allowed=true ugi=hdfs (auth:SIMPLE) ip=/172.27.37.200 cmd=setOwner src=/ats/active dst=null perm=yarn:hadoop:rwxr-xr-x proto=webhdfs
> 2017-05-12 06:39:51,526 INFO FSNamesystem.audit: allowed=true ugi=hdfs (auth:SIMPLE) ip=/172.27.37.200 cmd=setOwner src=/ats/ dst=null perm=yarn:hadoop:rwxr-xr-x proto=webhdfs
> 2017-05-12 06:39:51,580 INFO FSNamesystem.audit: allowed=true ugi=hdfs (auth:SIMPLE) ip=/172.27.32.12 cmd=getfileinfo src=/apps/hbase/staging dst=null perm=null proto=webhdfs
> 2017-05-12 06:39:51,620 INFO FSNamesystem.audit: allowed=true   ugi=hdfs 

Review Request 59394: Race condition: webhdfs call mkdir /tmp/druid-indexing before /tmp making tmp not writable.

2017-05-19 Thread Andrew Onischuk

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/59394/
---

Review request for Ambari and Vitalyi Brodetskyi.


Bugs: AMBARI-21070
https://issues.apache.org/jira/browse/AMBARI-21070


Repository: ambari


Description
---

Race condition: a webhdfs call runs mkdir on /tmp/druid-indexing before /tmp
is created, leaving /tmp not writable.

During an HDP install through Ambari, at the "start components on host< >"
step, webhdfs operations run in the background to create the HDFS directory
structures required by specific components (/tmp, /tmp/hive, /user/druid,
/tmp/druid-indexing ...).

The expected order is getfileInfo: /tmp --> mkdir: /tmp --> changePermission:
/tmp to 777 (hdfs:hdfs), so that /tmp is accessible to all and hivemetastore
is able to create /tmp/hive (the Hive scratch directory).

In a Druid install, however, mkdir of /tmp/druid-indexing is most of the time
called before the actual /tmp creation, so /tmp is created implicitly as a
parent directory with just the default directory permission (755).

-> The next call of getfileInfo: /tmp then reports that /tmp already exists,
so it is neither created nor has its permission changed.

This leaves /tmp not writable, so the HiveServer process shuts down because it
is unable to create or access /tmp/hive.
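
To make the ordering problem concrete, here is a minimal sketch using a toy
in-memory model of the calls above. It is not Ambari code: FakeHdfs,
create_if_missing and ensure_directory are hypothetical names, and
ensure_directory is only one possible way to avoid the race, not necessarily
what the attached patch does.

```python
# Toy in-memory stand-in for the webhdfs calls described above.
# All names here are illustrative assumptions, not Ambari or Hadoop APIs.
DEFAULT_MODE = 0o755


class FakeHdfs:
    def __init__(self):
        self.modes = {"/": DEFAULT_MODE}  # path -> permission bits

    def exists(self, path):
        # Stands in for getfileinfo.
        return path in self.modes

    def mkdirs(self, path):
        # Like HDFS mkdirs, missing parents are created too; here they get
        # the default mode, which is what leaves /tmp at 755 in the report.
        current = ""
        for part in [p for p in path.split("/") if p]:
            current += "/" + part
            self.modes.setdefault(current, DEFAULT_MODE)

    def set_permission(self, path, mode):
        self.modes[path] = mode


def create_if_missing(fs, path, mode):
    """Check-then-create: the mode is applied only when the path is new."""
    if fs.exists(path):
        return
    fs.mkdirs(path)
    fs.set_permission(path, mode)


def ensure_directory(fs, path, mode):
    """Possible alternative: apply the mode even if the path exists."""
    if not fs.exists(path):
        fs.mkdirs(path)
    fs.set_permission(path, mode)


racy = FakeHdfs()
create_if_missing(racy, "/tmp/druid-indexing", 0o755)  # Druid host goes first
create_if_missing(racy, "/tmp", 0o777)                 # skipped: /tmp exists
print(oct(racy.modes["/tmp"]))  # 0o755

safe = FakeHdfs()
ensure_directory(safe, "/tmp/druid-indexing", 0o755)
ensure_directory(safe, "/tmp", 0o777)
print(oct(safe.modes["/tmp"]))  # 0o777
```

In the first run /tmp keeps the implicit 755 because the existence check
short-circuits the permission change; in the second run the order of the two
calls no longer matters.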

hdfs-audit log:




2017-05-12 06:39:51,067 INFO FSNamesystem.audit: allowed=true ugi=hdfs (auth:SIMPLE) ip=/172.27.26.3 cmd=getfileinfo src=/tmp/druid-indexing dst=null perm=null proto=webhdfs
2017-05-12 06:39:51,120 INFO FSNamesystem.audit: allowed=true ugi=hdfs (auth:SIMPLE) ip=/172.27.22.81 cmd=contentSummary src=/user/druid dst=null perm=null proto=webhdfs
2017-05-12 06:39:51,133 INFO FSNamesystem.audit: allowed=true ugi=hdfs (auth:SIMPLE) ip=/172.27.37.200 cmd=setPermission src=/ats/active dst=null perm=hdfs:hadoop:rwxr-xr-x proto=webhdfs
2017-05-12 06:39:51,155 INFO FSNamesystem.audit: allowed=true ugi=hdfs (auth:SIMPLE) ip=/172.27.26.3 cmd=mkdirs src=/tmp/druid-indexing dst=null perm=hdfs:hdfs:rwxr-xr-x proto=webhdfs
2017-05-12 06:39:51,206 INFO FSNamesystem.audit: allowed=true ugi=hdfs (auth:SIMPLE) ip=/172.27.22.81 cmd=listStatus src=/user/druid dst=null perm=null proto=webhdfs
2017-05-12 06:39:51,235 INFO FSNamesystem.audit: allowed=true ugi=hdfs (auth:SIMPLE) ip=/172.27.37.200 cmd=setPermission src=/ats/ dst=null perm=yarn:hadoop:rwxr-xr-x proto=webhdfs
2017-05-12 06:39:51,249 INFO FSNamesystem.audit: allowed=true ugi=hdfs (auth:SIMPLE) ip=/172.27.26.3 cmd=setPermission src=/tmp/druid-indexing dst=null perm=hdfs:hdfs:rwxr-xr-x proto=webhdfs
2017-05-12 06:39:51,290 INFO FSNamesystem.audit: allowed=true ugi=hdfs (auth:SIMPLE) ip=/172.27.22.81 cmd=listStatus src=/user/druid/data dst=null perm=null proto=webhdfs
2017-05-12 06:39:51,339 INFO FSNamesystem.audit: allowed=true ugi=hdfs (auth:SIMPLE) ip=/172.27.37.200 cmd=setPermission src=/ats/active/ dst=null perm=hdfs:hadoop:rwxr-xr-x proto=webhdfs
2017-05-12 06:39:51,341 INFO FSNamesystem.audit: allowed=true ugi=hdfs (auth:SIMPLE) ip=/172.27.26.3 cmd=setOwner src=/tmp/druid-indexing dst=null perm=druid:hdfs:rwxr-xr-x proto=webhdfs
2017-05-12 06:39:51,380 INFO FSNamesystem.audit: allowed=true ugi=hdfs (auth:SIMPLE) ip=/172.27.22.81 cmd=setOwner src=/user/druid/data dst=null perm=druid:hdfs:rwxr-xr-x proto=webhdfs
2017-05-12 06:39:51,431 INFO FSNamesystem.audit: allowed=true ugi=hdfs (auth:SIMPLE) ip=/172.27.37.200 cmd=setOwner src=/ats/active dst=null perm=yarn:hadoop:rwxr-xr-x proto=webhdfs
2017-05-12 06:39:51,526 INFO FSNamesystem.audit: allowed=true ugi=hdfs (auth:SIMPLE) ip=/172.27.37.200 cmd=setOwner src=/ats/ dst=null perm=yarn:hadoop:rwxr-xr-x proto=webhdfs
2017-05-12 06:39:51,580 INFO FSNamesystem.audit: allowed=true ugi=hdfs (auth:SIMPLE) ip=/172.27.32.12 cmd=getfileinfo src=/apps/hbase/staging dst=null perm=null proto=webhdfs
2017-05-12 06:39:51,620 INFO FSNamesystem.audit: allowed=true ugi=hdfs (auth:SIMPLE) ip=/172.27.37.200 cmd=setOwner src=/ats/active/ dst=null perm=yarn:hadoop:rwxr-xr-x proto=webhdfs


2017-05-12 06:39:53,289 INFO FSNamesystem.audit: allowed=true ugi=hdfs (auth:SIMPLE) ip=/172.27.26.202 cmd=getfileinfo src=/tmp dst=null perm=null proto=webhdfs
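
As a rough aid for spotting this ordering in a longer audit log, the sketch
below parses entries like the ones above and flags any mkdirs on a path whose
parent has not yet been looked up with getfileinfo. The helper names and the
"parent not checked yet" heuristic are assumptions made for illustration;
they are not part of Ambari or Hadoop. The field names are taken from the
entries shown here.

```python
import re

# Field names as they appear in the audit entries above.
KEYS = "allowed|ugi|ip|cmd|src|dst|perm|proto"
# A value runs until the next known "key=" or the end of the line, so entries
# whose separators were lost (e.g. "cmd=setOwnersrc=...") still parse.
FIELD = re.compile(rf"({KEYS})=(.*?)(?=\s*(?:{KEYS})=|\s*$)")


def parse_audit_line(line):
    """Return (timestamp, fields) for a single audit entry."""
    timestamp = line.split(" INFO ", 1)[0]
    return timestamp, dict(FIELD.findall(line))


def implicit_parents(lines):
    """Yield (timestamp, parent, child) for mkdirs on a child whose parent
    has not been seen in an earlier getfileinfo call (heuristic only)."""
    checked = set()
    for line in lines:
        timestamp, fields = parse_audit_line(line)
        src = fields["src"].rstrip("/") or "/"
        if fields["cmd"] == "getfileinfo":
            checked.add(src)
        elif fields["cmd"] == "mkdirs":
            parent = src.rsplit("/", 1)[0] or "/"
            if parent != "/" and parent not in checked:
                yield timestamp, parent, src


sample = [
    "2017-05-12 06:39:51,155 INFO FSNamesystem.audit: allowed=true ugi=hdfs "
    "(auth:SIMPLE) ip=/172.27.26.3 cmd=mkdirs src=/tmp/druid-indexing "
    "dst=null perm=hdfs:hdfs:rwxr-xr-x proto=webhdfs",
    "2017-05-12 06:39:53,289 INFO FSNamesystem.audit: allowed=true ugi=hdfs "
    "(auth:SIMPLE) ip=/172.27.26.202 cmd=getfileinfo src=/tmp dst=null "
    "perm=null proto=webhdfs",
]

hits = list(implicit_parents(sample))
for hit in hits:
    print(hit)
```

With the two sample entries, the mkdirs on /tmp/druid-indexing is reported
because /tmp is only checked about two seconds later.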


The log shows /tmp/druid-indexing being accessed at 06:39:51 (hence /tmp has
just 755 permission, as created by that call), and /tmp being accessed
(getfileinfo) at 06:39:53, which returns /tmp already