[ https://issues.apache.org/jira/browse/YARN-3908?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14637300#comment-14637300 ]
Zhijie Shen commented on YARN-3908:
-----------------------------------

bq. Is it the event id + timestamp? How about the event type? If you look at the equals() and the hashCode() implementations of TimelineEvent, it uses the timestamp, the event type, and even the info as a whole, but the id is not used for equality. How does that square with the stated intent that the event id and the timestamp form the identity?

There's no event type now. In v1 it was called type, but in v2 it has been renamed to id. We want to use id + ts to identify an event object uniquely, to support the case where an event happens multiple times. That way we can avoid a combined ID like "container_allocation_13421543243". Does this make sense?

bq. Is pretty much the only access pattern "give me all the events that belong to this entity"?

Yeah: getting the events of one entity in chronological order, or getting just part of them via filtering.

bq. Two TimelineEvents are equal only if the timestamp is equal AND the type is equal AND the entire info maps are equal. What would we query by event type, timestamp and event info key? Do users always have to specify the timestamp?

There's no type, only ID. In the current reader API we cannot do sub-entity filtering, but in the future we can try to support, for example, getting the events in a given time window. If two events have the same <id, ts> but different info, we may consider them the same event carrying different information. The later put will append more k/v pairs or update the existing ones.

bq. Do we need to store only the latest event for each timestamp, or all of them? It would almost sound like the key should be type and timestamp, but what about the entire event info map?

In the DB, I think the proper logic is: if we put <event1, ts1> and <event1, ts2>, we should have two separate records persisted; and if we put <event1, ts1, info: [k1=v1, k2=v2]> and then <event1, ts1, info: [k1=v1']>, we should update the same record and let k1=v1'.
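To make the proposed semantics concrete, here is a minimal in-memory sketch in Java. It is not the actual TimelineEvent or HBaseTimelineWriterImpl code; the class names EventKey and EventStoreSketch and all of their methods are hypothetical, and a sorted map stands in for the backing store. It assumes equality on <id, ts> only, with a later put merging into the existing info.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;
import java.util.SortedMap;
import java.util.TreeMap;

// Hypothetical key: identity is <id, timestamp> only; info is NOT part of it.
final class EventKey implements Comparable<EventKey> {
    final String id;
    final long timestamp;

    EventKey(String id, long timestamp) {
        this.id = Objects.requireNonNull(id);
        this.timestamp = timestamp;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof EventKey)) return false;
        EventKey other = (EventKey) o;
        return timestamp == other.timestamp && id.equals(other.id);
    }

    @Override
    public int hashCode() {
        return Objects.hash(id, timestamp);
    }

    // Chronological order; ties on timestamp are broken by id.
    @Override
    public int compareTo(EventKey other) {
        int c = Long.compare(timestamp, other.timestamp);
        return c != 0 ? c : id.compareTo(other.id);
    }
}

// Hypothetical store for the events of a single entity.
final class EventStoreSketch {
    private final TreeMap<EventKey, Map<String, Object>> records = new TreeMap<>();

    // Same <id, ts>: merge into the existing record (append or overwrite keys).
    // Different ts (or id): a separate record is created.
    void put(String id, long ts, Map<String, Object> info) {
        records.computeIfAbsent(new EventKey(id, ts), k -> new HashMap<>()).putAll(info);
    }

    int size() {
        return records.size();
    }

    Map<String, Object> info(String id, long ts) {
        return records.get(new EventKey(id, ts));
    }

    // All events, oldest first.
    SortedMap<EventKey, Map<String, Object>> chronological() {
        return records;
    }
}
```

With this sketch, putting <event1, ts1> and <event1, ts2> yields two records, while a second put of <event1, ts1> with k1=v1' overwrites k1 and leaves k2 untouched. Iterating the sorted map gives the chronological access pattern described above.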
> Bugs in HBaseTimelineWriterImpl
> -------------------------------
>
>                 Key: YARN-3908
>                 URL: https://issues.apache.org/jira/browse/YARN-3908
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: timelineserver
>            Reporter: Zhijie Shen
>            Assignee: Vrushali C
>         Attachments: YARN-3908-YARN-2928.001.patch, 
> YARN-3908-YARN-2928.002.patch, YARN-3908-YARN-2928.003.patch, 
> YARN-3908-YARN-2928.004.patch, YARN-3908-YARN-2928.004.patch, 
> YARN-3908-YARN-2928.005.patch
>
>
> 1. In HBaseTimelineWriterImpl, the info column family contains the basic 
> fields of a timeline entity plus events. However, entity#info map is not 
> stored at all.
> 2. event#timestamp is also not persisted.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)