[jira] [Commented] (YARN-1185) FileSystemRMStateStore doesn't use temporary files when writing data
[ https://issues.apache.org/jira/browse/YARN-1185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13765649#comment-13765649 ]

Bikas Saha commented on YARN-1185:

Yes, since the FileSystem interface does not provide any atomic operations. The RM will not start if there is anything wrong with the stored state, so if some write is partial or empty, it will not start. At that point we can judge whether the missing piece is important, purge that piece, and continue. This should be OK for job-related data, since we only lose a job. However, for global data like secret keys we may have to be more careful. In one case we encode the info in the file name. In other cases, where the data cannot be encoded in the file name, we may have to ensure that the store operation is not partial or empty. For HDFS we may assume atomic rename, but will that be true for all filesystems?

So we could do the following. Storing app data may continue to be optimistic, and since that's the main workload we continue to do what we do today. Storing global data (mainly the security stuff) can change to be more atomic. We can make all store operations more atomic if we feel that the multiple round trips to the store will not slow down the RM.

FileSystemRMStateStore doesn't use temporary files when writing data

Key: YARN-1185
URL: https://issues.apache.org/jira/browse/YARN-1185
Project: Hadoop YARN
Issue Type: Bug
Components: resourcemanager
Affects Versions: 2.1.0-beta
Reporter: Jason Lowe

FileSystemRMStateStore writes directly to the destination file when storing state. However, if the RM were to crash in the middle of the write, the recovery method could encounter a partially written file and either crash outright during recovery or silently load incomplete state. To avoid this, the data should be written to a temporary file and renamed to the destination file afterwards.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
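The write-to-temporary-file-then-rename approach proposed in the issue description can be sketched roughly as follows. This is a local-filesystem analogy in Python, not the Hadoop FileSystem API and not the eventual patch; the function name `store_atomically` and the `.tmp` suffix are hypothetical. It also assumes the rename is atomic, which holds for `os.replace` on a POSIX local filesystem but, as the comment above asks, may not hold for every store.

```python
import os
import tempfile


def store_atomically(dest_path: str, data: bytes) -> None:
    """Write data to a temporary file, then rename it over dest_path.

    Hypothetical helper for illustration only.
    """
    dest_dir = os.path.dirname(os.path.abspath(dest_path))
    # Create the temp file next to the destination so the rename does not
    # cross a filesystem boundary.
    fd, tmp_path = tempfile.mkstemp(dir=dest_dir, suffix=".tmp")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())  # push the bytes to disk before renaming
        # Assumption: the rename is atomic on this filesystem.
        os.replace(tmp_path, dest_path)
    except BaseException:
        # On failure the destination is untouched; drop the temp file.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
```

With this shape, a crash before the rename leaves at most a stray `.tmp` file, which a recovery pass would have to ignore or delete. The extra rename is also the additional round trip to the store that the comment above weighs against the app-data workload.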
[jira] [Commented] (YARN-1185) FileSystemRMStateStore doesn't use temporary files when writing data
[ https://issues.apache.org/jira/browse/YARN-1185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13765622#comment-13765622 ]

Jason Lowe commented on YARN-1185:

Also, couldn't it be left with zero-length files if it dies after create but before write can occur?
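One way a recovery pass could tell a zero-length or truncated file from a complete one is to frame each record with its length and a checksum. The sketch below is an assumption made for illustration, not how FileSystemRMStateStore encodes its data; `frame`, `check`, and the header layout are hypothetical.

```python
import struct
import zlib

# Hypothetical header: payload length and CRC32 of the payload,
# both as big-endian unsigned 32-bit integers.
HEADER = struct.Struct(">II")


def frame(payload: bytes) -> bytes:
    """Prefix the payload with its length and checksum."""
    return HEADER.pack(len(payload), zlib.crc32(payload)) + payload


def check(blob: bytes) -> str:
    """Classify the bytes read back from a state file."""
    if len(blob) == 0:
        return "empty"      # died after create, before any write
    if len(blob) < HEADER.size:
        return "corrupt"    # not even a full header was written
    length, crc = HEADER.unpack_from(blob)
    payload = blob[HEADER.size:]
    if len(payload) != length or zlib.crc32(payload) != crc:
        return "corrupt"    # truncated or otherwise damaged payload
    return "ok"
```

A loader could then decide per file whether an `"empty"` or `"corrupt"` result should stop startup or whether the entry can be purged.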
[jira] [Commented] (YARN-1185) FileSystemRMStateStore doesn't use temporary files when writing data
[ https://issues.apache.org/jira/browse/YARN-1185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13765620#comment-13765620 ]

Jason Lowe commented on YARN-1185:

Ah I see. That's an assumption based on a specific FileSystem implementation -- does it only work with HDFS?
[jira] [Commented] (YARN-1185) FileSystemRMStateStore doesn't use temporary files when writing data
[ https://issues.apache.org/jira/browse/YARN-1185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13765607#comment-13765607 ]

Bikas Saha commented on YARN-1185:

The expectation was that the data written to each file is small enough to be sent over completely in the first transfer, so partial writes would not occur. That's why the code accumulates the write buffer locally and then issues a single write to the FileSystem.
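The buffer-locally-then-write-once pattern described in this comment looks roughly like the sketch below. It is a Python illustration only, not the RM code; `store_with_single_write` and `CountingWriter` are hypothetical names, and the writer class exists just to show that the destination sees one write call.

```python
import io
from typing import Iterable


class CountingWriter:
    """Stand-in for an output stream that records how often write() is called."""

    def __init__(self) -> None:
        self.calls = 0
        self.data = b""

    def write(self, chunk: bytes) -> int:
        self.calls += 1
        self.data += chunk
        return len(chunk)


def store_with_single_write(out, fields: Iterable[bytes]) -> None:
    """Assemble all fields in memory, then hand them to `out` in one call."""
    buf = io.BytesIO()
    for field in fields:
        buf.write(field)          # local accumulation, no I/O to the store yet
    out.write(buf.getvalue())     # the only write the destination sees


writer = CountingWriter()
store_with_single_write(writer, [b"app-id", b"|", b"state"])
```

A single write call keeps the window for a partial write small for small payloads, but by itself it does not make the store operation atomic, which is the concern raised in the issue description.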