-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/31878/#review75924
-----------------------------------------------------------

Ship it!


I agree that we can let NameNode do the locking here. I don't care if both 
agents do the same work and the last one in wins.


ambari-common/src/main/python/resource_management/libraries/functions/dynamic_variable_interpretation.py
<https://reviews.apache.org/r/31878/#comment123252>

    That's a lot of code to do something as simple as:
    
    ```
    unique_string = str(uuid.uuid4())[:8]
    ```
    
    I know we don't need UUID power here, but it's concise and makes the code 
cleaner.
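
For illustration, a minimal sketch of the suggested approach (the helper name and the tarball filename are hypothetical, not from the patch):

```python
import uuid

def unique_suffix(length=8):
    """Return a short random suffix taken from a UUID4 string."""
    return str(uuid.uuid4())[:length]

# e.g. build a per-agent temporary name for an HDFS upload
tmp_name = "tez.tar.gz.{0}".format(unique_suffix())
```

The first 8 hex characters of a UUID4 give roughly 4 billion possible values, which is more than enough to keep two agents from colliding on a temp name.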


- Jonathan Hurley


On March 9, 2015, 9:41 p.m., Alejandro Fernandez wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/31878/
> -----------------------------------------------------------
> 
> (Updated March 9, 2015, 9:41 p.m.)
> 
> 
> Review request for Ambari, Andrew Onischuk, Jonathan Hurley, Nate Cole, and 
> Sid Wagle.
> 
> 
> Bugs: AMBARI-9990
>     https://issues.apache.org/jira/browse/AMBARI-9990
> 
> 
> Repository: ambari
> 
> 
> Description
> -------
> 
> Pig Service Check and Hive Server 2 START ran on 2 different machines during 
> the stack installation and failed to copy the tez tarball to HDFS.
> 
> I was able to reproduce this locally by calling CopyFromLocal from two 
> clients simultaneously. See the HDFS audit log, datanode logs on c6408 & 
> c6410, and namenode log on c6410.
> 
> The copyFromLocal command's behavior is:
> * Try to create a temporary file <filename>._COPYING_ and write the real data 
> there
> * If any exception is hit, delete the file named <filename>._COPYING_
> 
> Thus we have the following race condition in this test:
> 1. Process P1 creates the file "tez.tar.gz._COPYING_" and writes data to it.
> 2. Process P2 issues the same copyFromLocal command and hits an exception 
> because it cannot acquire the lease.
> 3. P2 then deletes the file "tez.tar.gz._COPYING_".
> 4. P1 cannot close the file "tez.tar.gz._COPYING_" since it has been deleted 
> by P2; the exception reads "could not find lease for file...".
> In general, the "copyFromLocal" command does not give us the correct 
> synchronization guarantee.
> 
> One solution is for the destination file name to be unique. Because the mv 
> command is synchronized by the namenode, at least one of them will succeed in 
> naming the file.
> 
> 
> Diffs
> -----
> 
>   
> ambari-common/src/main/python/resource_management/libraries/functions/dynamic_variable_interpretation.py
>  00b8d70 
> 
> Diff: https://reviews.apache.org/r/31878/diff/
> 
> 
> Testing
> -------
> 
> Unit tests on builds.apache.org passed:
> https://builds.apache.org/job/Ambari-trunk-test-patch/1977/
> 
> I also deployed a cluster and verified that it was able to copy the tarballs 
> to HDFS when installing YARN, Hive, Pig.
> 
> [root@c6408 ~]# su - hdfs -c 'hadoop fs -ls -R /hdp/apps/2.2.2.0-2538/'
> dr-xr-xr-x   - hdfs hdfs          0 2015-03-10 00:55 
> /hdp/apps/2.2.2.0-2538/hive
> -r--r--r--   3 hdfs hadoop   82982575 2015-03-10 00:55 
> /hdp/apps/2.2.2.0-2538/hive/hive.tar.gz
> dr-xr-xr-x   - hdfs hdfs            0 2015-03-10 00:57 
> /hdp/apps/2.2.2.0-2538/mapreduce
> -r--r--r--   3 hdfs hadoop     105000 2015-03-10 00:57 
> /hdp/apps/2.2.2.0-2538/mapreduce/hadoop-streaming.jar
> -r--r--r--   3 hdfs hadoop  192699956 2015-03-09 18:15 
> /hdp/apps/2.2.2.0-2538/mapreduce/mapreduce.tar.gz
> dr-xr-xr-x   - hdfs hdfs            0 2015-03-10 00:56 
> /hdp/apps/2.2.2.0-2538/pig
> -r--r--r--   3 hdfs hadoop   97542246 2015-03-10 00:56 
> /hdp/apps/2.2.2.0-2538/pig/pig.tar.gz
> dr-xr-xr-x   - hdfs hdfs            0 2015-03-09 18:15 
> /hdp/apps/2.2.2.0-2538/tez
> -r--r--r--   3 hdfs hadoop   40656789 2015-03-09 18:15 
> /hdp/apps/2.2.2.0-2538/tez/tez.tar.gz
> 
> 
> Thanks,
> 
> Alejandro Fernandez
> 
>
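
The unique-name-plus-atomic-rename fix described above can be sketched with a local-filesystem analogy (function and path names are hypothetical; on HDFS the rename is serialized by the NameNode, just as `os.rename` is atomic on a POSIX filesystem):

```python
import os
import shutil
import uuid

def copy_with_unique_rename(src, dest):
    """Copy src to dest via a uniquely named temp file, then rename.

    Because each writer uses its own temp name, a concurrent writer
    cannot delete this writer's in-flight file; the atomic rename means
    at least one writer succeeds and the last rename to finish wins.
    """
    tmp = "{0}._COPYING_.{1}".format(dest, str(uuid.uuid4())[:8])
    shutil.copyfile(src, tmp)
    os.rename(tmp, dest)  # atomic; analogous to an HDFS mv
```

This mirrors the "let the NameNode do the locking" reasoning above: correctness comes from the atomic rename, not from coordinating the agents.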
