[ https://issues.apache.org/jira/browse/YARN-975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13805907#comment-13805907 ]
Vinod Kumar Vavilapalli commented on YARN-975: ---------------------------------------------- Had a look at the patch, some comments: - HDFS jar is not needed as a test-dependency in hadoop-yarn-server-applicationhistoryservice/pom.xml - You should make HistoryFileReader and HistoryFileWriter as static private inner classes and avoid sharing state altogether. - Wrap new ApplicationStartDataPBImpl(ApplicationStartDataProto.parseFrom(entry.value))) into a method in ApplicationStartData. Similarly others. - getApplicationAttempts(): If there is no history-file, we should throw a valid exception? - finishDtata: Typo - There is no limit on outstandingWriters. If RM runs 1K applications in parallel, we'll have 1K writers - RM can thus potentially go out of file handles. We need to limit this (configurable?) and queue any more writes into a limited number of threads. Can do in a follow up JIRA, please file one. - appId + START_DATA_SUFFIX: Instead of strings and appending, you can write a complex key which has an ApplicationId and the start marker and convert them to bytes when storing via a getBytes() method. - Similarly for ApplicationAttempt and Container suffixes. - When a HistoryFile exists, HistoryFileWriter should open it in append mode. - In both the reader and the writer, you should use IOUtils.cleanup() instead of explicitly calling close on each stream yourselves everywhere. - Don't think we should do this. Any retries should be inside FileSystemHistoryStore. We should close the writer in a finally block. {code} + // Not put close() in finally block in case callers want to retry writing+ // the data. On the other hand, the file will anyway be close when the + // store is stopped. {code} - Dismantle retriveStartFinishData() into two methods - one for start and one for finish. - TestApplicationHistoryStore was renamed in YARN-956, please update the patch - Test: A single file will only have data about a single application. So testWriteHistoryData() should not have multiple applications. Similarly ApplicationAttempt finish to follow after container-finish. - Test: We should NOT have this dependency. Java 7 reorders tests in some cases. {code} + // The order of the test cases matters {code} > Add a file-system implementation for history-storage > ---------------------------------------------------- > > Key: YARN-975 > URL: https://issues.apache.org/jira/browse/YARN-975 > Project: Hadoop YARN > Issue Type: Sub-task > Reporter: Zhijie Shen > Assignee: Zhijie Shen > Attachments: YARN-975.10.patch, YARN-975.1.patch, YARN-975.2.patch, > YARN-975.3.patch, YARN-975.4.patch, YARN-975.5.patch, YARN-975.6.patch, > YARN-975.7.patch, YARN-975.8.patch, YARN-975.9.patch > > > HDFS implementation should be a standard persistence strategy of history > storage -- This message was sent by Atlassian JIRA (v6.1#6144)