[ https://issues.apache.org/jira/browse/YARN-3815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14598759#comment-14598759 ]
Joep Rottinghuis commented on YARN-3815: ---------------------------------------- Thanks [~ted_yu] for that link. I did find that code and I'm reading through it. Yes it uses a coprocessor on the reading side to "collapse" values together and permanently "collapse" them together on compaction. I want to use a similar approach here. We cannot use the delta write directly as-is for the following reasons: - For running applications, if we wanted to write only the increment the AM (or ATS writer) will have to keep track of the previous values in order to write the increment only. When the AM crashes and/or the ATS writer restarts we won't know what previous value we had written (and what has already been aggregated. So, we'd have to write the increment plus the latest value. - Ergo, why don't we just write the latest value to begin with and leave off the increment. Now we cannot "collapse" the deltas / latest value until the application is done. Otherwise we would again loose track of what was previously aggregated. So the new approach would be to write the latest value for an app and indicate (using a cell tag) that the app is done and that it can be a collapsed. We would use the co-processor only on the read-side just like with the delta write and that co-processor would aggregate values on the fly for reads and collapse during writes. Those writes would be limited to one single row, so we wouldn't have any weird cross-region locking issues, nor delays and hickups in the write throughput. > [Aggregation] Application/Flow/User/Queue Level Aggregations > ------------------------------------------------------------ > > Key: YARN-3815 > URL: https://issues.apache.org/jira/browse/YARN-3815 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver > Reporter: Junping Du > Assignee: Junping Du > Priority: Critical > Attachments: Timeline Service Nextgen Flow, User, Queue Level > Aggregations (v1).pdf > > > Per previous discussions in some design documents for YARN-2928, the basic > scenario is the query for stats can happen on: > - Application level, expect return: an application with aggregated stats > - Flow level, expect return: aggregated stats for a flow_run, flow_version > and flow > - User level, expect return: aggregated stats for applications submitted by > user > - Queue level, expect return: aggregated stats for applications within the > Queue > Application states is the basic building block for all other level > aggregations. We can provide Flow/User/Queue level aggregated statistics info > based on application states (a dedicated table for application states is > needed which is missing from previous design documents like HBase/Phoenix > schema design). -- This message was sent by Atlassian JIRA (v6.3.4#6332)