[ https://issues.apache.org/jira/browse/YARN-6357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15936727#comment-15936727 ]
Vrushali C commented on YARN-6357:
----------------------------------

In the write APIs, I think it would be good to have an explicitly named async
API and a sync API (rather than persistXX). I think the API doc should explain
that the sync one means the call results in an HBase flush, and that async
means the write will be translated to hbase#flush after a time interval or at
the end of the application life.

> Implement TimelineCollector#putEntitiesAsync
> --------------------------------------------
>
>                 Key: YARN-6357
>                 URL: https://issues.apache.org/jira/browse/YARN-6357
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: ATSv2, timelineserver
>    Affects Versions: YARN-2928
>            Reporter: Joep Rottinghuis
>            Assignee: Haibo Chen
>              Labels: yarn-5355-merge-blocker
>         Attachments: YARN-6357.01.patch, YARN-6357.02.patch
>
>
> As discovered and discussed in YARN-5269, the
> TimelineCollector#putEntitiesAsync method is currently not implemented and
> TimelineCollector#putEntities is asynchronous.
> TimelineV2ClientImpl#putEntities vs TimelineV2ClientImpl#putEntitiesAsync
> correctly call TimelineEntityDispatcher#dispatchEntities(boolean sync,...
> with the correct argument. This argument does seem to make it into the
> params, and on the server side TimelineCollectorWebService#putEntities
> correctly pulls the async parameter from the rest call. See line 156:
> {code}
>     boolean isAsync = async != null && async.trim().equalsIgnoreCase("true");
> {code}
> However, this is where the problem starts. It simply calls
> TimelineCollector#putEntities and ignores the value of isAsync. It should
> instead have called TimelineCollector#putEntitiesAsync, which is currently
> not implemented.
> putEntities should call putEntitiesAsync and then, after that, call
> writer.flush().
> The fact that we flush on close and flush periodically should be more a
> matter of avoiding data loss: the flush on close covers the case where sync
> is never called, and the periodic flush guards against data from slow
> writers being buffered for a long time, which would expose us to loss if the
> collector crashes with data in its buffers. Size-based flush is a different
> concern: avoiding blowing up the memory footprint.
> The spooling behavior is also somewhat separate.
> We have two separate methods on our API, putEntities and putEntitiesAsync,
> and they should have different behavior beyond waiting for the request to be
> sent.
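
The behavior proposed in the issue description (putEntities calls
putEntitiesAsync and then writer.flush(), with the web service dispatching on
isAsync) might look roughly like the sketch below. This is only an
illustration under stated assumptions, not the patch attached to this issue:
SketchWriter, BufferingWriter, SketchCollector and SketchWebService are
hypothetical stand-ins, entities are plain strings, and the in-memory buffer
takes the place of the HBase-backed writer. Periodic and size-based flushing
are left out.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for the storage writer; not the real ATSv2 writer API. */
interface SketchWriter {
  void write(List<String> entities); // may only buffer the entities
  void flush();                      // pushes buffered entities to storage
}

/** In-memory writer that holds entities in a buffer until flush() is called. */
class BufferingWriter implements SketchWriter {
  private final List<String> buffer = new ArrayList<>();
  private final List<String> stored = new ArrayList<>();

  @Override
  public synchronized void write(List<String> entities) {
    buffer.addAll(entities);
  }

  @Override
  public synchronized void flush() {
    stored.addAll(buffer);
    buffer.clear();
  }

  synchronized int bufferedCount() { return buffer.size(); }
  synchronized int storedCount() { return stored.size(); }
}

/** Hypothetical collector showing the sync/async split described in the issue. */
class SketchCollector implements AutoCloseable {
  private final SketchWriter writer;

  SketchCollector(SketchWriter writer) {
    this.writer = writer;
  }

  /** Async path: hand the entities to the writer and return without flushing. */
  void putEntitiesAsync(List<String> entities) {
    writer.write(entities);
  }

  /** Sync path: the async write followed by an explicit flush. */
  void putEntities(List<String> entities) {
    putEntitiesAsync(entities);
    writer.flush();
  }

  /** Flush on close so buffered data survives even if sync is never called. */
  @Override
  public void close() {
    writer.flush();
  }
}

/** Hypothetical web-service entry point that honors the async parameter. */
class SketchWebService {
  static void putEntities(SketchCollector collector, List<String> entities,
      String async) {
    // Same parsing as the line quoted in the issue description.
    boolean isAsync = async != null && async.trim().equalsIgnoreCase("true");
    if (isAsync) {
      collector.putEntitiesAsync(entities);
    } else {
      collector.putEntities(entities);
    }
  }
}
```

With this shape, a request carrying async=true leaves its entities in the
writer's buffer, while a request without the parameter leaves nothing
buffered once it returns. The flush in close() is the safety net the
description mentions for callers that never use the sync path.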