[ 
https://issues.apache.org/jira/browse/YARN-3901?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14746352#comment-14746352
 ] 

Li Lu commented on YARN-3901:
-----------------------------

Thanks [~sjlee0]! 

bq. It's essentially using AtomicLong.compareAndSet(). The few lines around it 
are mostly to keep pace with the current time. I hope that makes sense.
Yes, the algorithm makes sense. I thought the JVM might have a special 
optimization for the getAndAdd method on x86, since x86 has a dedicated 
fetch-and-add instruction (LOCK XADD). However, I'm not sure this is actually 
reflected in most current JVMs (https://bugs.openjdk.java.net/browse/JDK-6973482). 
I'm OK with either one, since whatever advantage one has over the other is 
unlikely to matter in our use case 
(https://blogs.oracle.com/dave/entry/atomic_fetch_and_add_vs). If keeping up 
with the current time is important, we need to stick with the CAS-based 
solution and _not_ change it in the future. 
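
For reference, here is a minimal sketch of the CAS-based approach as I 
understand it (illustrative only, not the actual patch code; class and method 
names are made up): each call hands out the max of the wall clock and the 
previous value plus one, retrying when the CAS fails.

{code:java}
import java.util.concurrent.atomic.AtomicLong;

// Illustrative sketch: a strictly increasing timestamp source that keeps
// pace with the wall clock via AtomicLong.compareAndSet() in a retry loop.
public class MonotonicTimestampGenerator {
  private final AtomicLong lastTimestamp = new AtomicLong(0L);

  public long next() {
    while (true) {
      long last = lastTimestamp.get();
      // Follow the current time, but never go backwards or repeat a value.
      long candidate = Math.max(System.currentTimeMillis(), last + 1);
      if (lastTimestamp.compareAndSet(last, candidate)) {
        return candidate;
      }
      // Lost the CAS to a concurrent caller; re-read and retry.
    }
  }
}
{code}

A getAndAdd-based variant would map to LOCK XADD and avoid the retry loop, but 
it can only count upwards; it cannot snap back to the wall clock, which is why 
the CAS loop is the one to keep if tracking current time matters.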

bq.  We create a new record in this table any time a new activity is done for a 
given day for a flow.
Thanks. I missed the {{getTopOfTheDayTimestamp}} part. 
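
For other reviewers following along: as I now understand it, the per-day row 
comes from truncating the event timestamp to the top of its day, so a flow gets 
at most one activity row per day. A rough sketch with hypothetical names (not 
the patch itself):

{code:java}
import java.util.concurrent.TimeUnit;

// Illustrative sketch: collapse a millisecond timestamp to the start of its
// (UTC) day, so all activity for a flow within one day maps to one row key.
public final class DayTimestamp {
  private static final long MILLIS_PER_DAY = TimeUnit.DAYS.toMillis(1);

  public static long topOfTheDay(long timestampMillis) {
    return timestampMillis - (timestampMillis % MILLIS_PER_DAY);
  }
}
{code}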

bq. The prefix is just cluster! and we can grab from the beginning. What you 
mention would get activities for today only, which is slightly different.
Right. Thanks for the clarification! I meant activities for the past 24 hours. 
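
To make the distinction concrete: a prefix scan on just {{cluster!}} walks 
every day's activity rows for the cluster, while adding today's day timestamp 
to the prefix would return today's rows only. A sketch of the former, assuming 
an HBase 1.x client; the table name and the "!" separator are placeholders, 
not the names from the patch:

{code:java}
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.util.Bytes;

public class FlowActivityScanExample {
  // Illustrative sketch: scan all flow activity rows for a cluster, relying
  // on the row key starting with "cluster!".
  static void scanClusterActivity(Connection conn, String cluster) throws Exception {
    Scan scan = new Scan();
    scan.setRowPrefixFilter(Bytes.toBytes(cluster + "!"));
    try (Table table =
             conn.getTable(TableName.valueOf("timelineservice.flowactivity"));
         ResultScanner scanner = table.getScanner(scan)) {
      for (Result result : scanner) {
        // Each row covers one (cluster, day, user, flow) activity record.
      }
    }
  }
}
{code}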

> Populate flow run data in the flow_run & flow activity tables
> -------------------------------------------------------------
>
>                 Key: YARN-3901
>                 URL: https://issues.apache.org/jira/browse/YARN-3901
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: timelineserver
>            Reporter: Vrushali C
>            Assignee: Vrushali C
>         Attachments: YARN-3901-YARN-2928.1.patch, 
> YARN-3901-YARN-2928.2.patch, YARN-3901-YARN-2928.3.patch, 
> YARN-3901-YARN-2928.4.patch, YARN-3901-YARN-2928.5.patch, 
> YARN-3901-YARN-2928.6.patch, YARN-3901-YARN-2928.7.patch, 
> YARN-3901-YARN-2928.8.patch
>
>
> As per the schema proposed in YARN-3815 in 
> https://issues.apache.org/jira/secure/attachment/12743391/hbase-schema-proposal-for-aggregation.pdf
> filing this jira to track the creation and population of data in the flow run table. 
> Some points that are being considered:
> - Stores per flow run information aggregated across applications, flow version
> - RM’s collector writes to it on app creation and app completion
> - Per-app collector writes to it for metric updates at a slower frequency 
> than the metric updates to the application table
> - Primary key: cluster ! user ! flow ! flow run id
> - Only the latest version of flow-level aggregated metrics will be kept, even 
> if the entity and application level keep a timeseries.
> - The running_apps column will be incremented on app creation, and 
> decremented on app completion.
> - For min_start_time the RM writer will simply write a value with the tag for 
> the applicationId. A coprocessor will return the min value of all written 
> values.
> - Upon flush and compactions, the min value between all the cells of this 
> column will be written to the cell without any tag (empty tag) and all the 
> other cells will be discarded.
> - Ditto for the max_end_time, but then the max will be kept.
> - Tags are represented as #type:value. The type can be not set (0), or can 
> indicate running (1) or complete (2). In those cases (for metrics) only 
> complete app metrics are collapsed on compaction.
> - The m! values are aggregated (summed) upon read. Only when applications are 
> completed (indicated by tag type 2) can the values be collapsed.
> - The application ids that have completed and been aggregated into the flow 
> numbers are retained in a separate column for historical tracking: we don’t 
> want to re-aggregate for those upon replay
> 
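
On the running_apps point in the description above, my mental model is a plain 
HBase increment against the flow run row, with a negative delta on completion. 
A sketch under that assumption only (row key layout, separator, column family, 
and table name are all simplified placeholders, not the patch's names):

{code:java}
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.Increment;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.util.Bytes;

public class RunningAppsCounterExample {
  // Illustrative sketch: +1 on app creation, -1 on app completion, against the
  // flow run row keyed by cluster!user!flow!flowRunId (separator simplified).
  static void bumpRunningApps(Connection conn, String cluster, String user,
      String flow, long flowRunId, long delta) throws Exception {
    byte[] rowKey =
        Bytes.toBytes(cluster + "!" + user + "!" + flow + "!" + flowRunId);
    Increment inc = new Increment(rowKey);
    inc.addColumn(Bytes.toBytes("i"), Bytes.toBytes("running_apps"), delta);
    try (Table table = conn.getTable(TableName.valueOf("timelineservice.flowrun"))) {
      table.increment(inc);
    }
  }
}
{code}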



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
