[jira] [Commented] (YARN-3908) Bugs in HBaseTimelineWriterImpl
[ https://issues.apache.org/jira/browse/YARN-3908?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15369770#comment-15369770 ] Hudson commented on YARN-3908: -- SUCCESS: Integrated in Hadoop-trunk-Commit #10074 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/10074/]) YARN-3908. Fixed bugs in HBaseTimelineWriterImpl. Contributed by (sjlee: rev a9fab9b644e636c1f1b2632130d4eaea70111f16) * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice/src/main/java/org/apache/hadoop/yarn/server/timelineservice/storage/common/Separator.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice/src/test/java/org/apache/hadoop/yarn/server/timelineservice/storage/TestHBaseTimelineWriterImpl.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice/src/main/java/org/apache/hadoop/yarn/server/timelineservice/storage/entity/EntityColumnPrefix.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice/src/main/java/org/apache/hadoop/yarn/server/timelineservice/storage/common/ColumnPrefix.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice/src/main/java/org/apache/hadoop/yarn/server/timelineservice/storage/HBaseTimelineWriterImpl.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/records/timelineservice/TimelineEvent.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice/src/main/java/org/apache/hadoop/yarn/server/timelineservice/storage/entity/EntityTable.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice/src/main/java/org/apache/hadoop/yarn/server/timelineservice/storage/common/ColumnHelper.java > Bugs in HBaseTimelineWriterImpl > --- > > Key: YARN-3908 > URL: https://issues.apache.org/jira/browse/YARN-3908 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Reporter: Zhijie Shen >Assignee: Vrushali C > Fix For: YARN-2928 > > Attachments: YARN-3908-YARN-2928.001.patch, > YARN-3908-YARN-2928.002.patch, YARN-3908-YARN-2928.003.patch, > YARN-3908-YARN-2928.004.patch, YARN-3908-YARN-2928.004.patch, > YARN-3908-YARN-2928.005.patch > > > 1. In HBaseTimelineWriterImpl, the info column family contains the basic > fields of a timeline entity plus events. However, entity#info map is not > stored at all. > 2 event#timestamp is also not persisted. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-3908) Bugs in HBaseTimelineWriterImpl
[ https://issues.apache.org/jira/browse/YARN-3908?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14643476#comment-14643476 ] Zhijie Shen commented on YARN-3908: --- Sure, as most folks are comfortable with the latest patch, let's get this in. I'll file a separate jira to track the discussion about event column key. > Bugs in HBaseTimelineWriterImpl > --- > > Key: YARN-3908 > URL: https://issues.apache.org/jira/browse/YARN-3908 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Reporter: Zhijie Shen >Assignee: Vrushali C > Attachments: YARN-3908-YARN-2928.001.patch, > YARN-3908-YARN-2928.002.patch, YARN-3908-YARN-2928.003.patch, > YARN-3908-YARN-2928.004.patch, YARN-3908-YARN-2928.004.patch, > YARN-3908-YARN-2928.005.patch > > > 1. In HBaseTimelineWriterImpl, the info column family contains the basic > fields of a timeline entity plus events. However, entity#info map is not > stored at all. > 2 event#timestamp is also not persisted. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3908) Bugs in HBaseTimelineWriterImpl
[ https://issues.apache.org/jira/browse/YARN-3908?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14643446#comment-14643446 ] Vrushali C commented on YARN-3908: -- Yes, let's get the current patch in and continue discussion on the event schema > Bugs in HBaseTimelineWriterImpl > --- > > Key: YARN-3908 > URL: https://issues.apache.org/jira/browse/YARN-3908 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Reporter: Zhijie Shen >Assignee: Vrushali C > Attachments: YARN-3908-YARN-2928.001.patch, > YARN-3908-YARN-2928.002.patch, YARN-3908-YARN-2928.003.patch, > YARN-3908-YARN-2928.004.patch, YARN-3908-YARN-2928.004.patch, > YARN-3908-YARN-2928.005.patch > > > 1. In HBaseTimelineWriterImpl, the info column family contains the basic > fields of a timeline entity plus events. However, entity#info map is not > stored at all. > 2 event#timestamp is also not persisted. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3908) Bugs in HBaseTimelineWriterImpl
[ https://issues.apache.org/jira/browse/YARN-3908?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14643424#comment-14643424 ] Sangjin Lee commented on YARN-3908: --- +1 for committing this patch and having the event schema discussion in a different JIRA. > Bugs in HBaseTimelineWriterImpl > --- > > Key: YARN-3908 > URL: https://issues.apache.org/jira/browse/YARN-3908 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Reporter: Zhijie Shen >Assignee: Vrushali C > Attachments: YARN-3908-YARN-2928.001.patch, > YARN-3908-YARN-2928.002.patch, YARN-3908-YARN-2928.003.patch, > YARN-3908-YARN-2928.004.patch, YARN-3908-YARN-2928.004.patch, > YARN-3908-YARN-2928.005.patch > > > 1. In HBaseTimelineWriterImpl, the info column family contains the basic > fields of a timeline entity plus events. However, entity#info map is not > stored at all. > 2 event#timestamp is also not persisted. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3908) Bugs in HBaseTimelineWriterImpl
[ https://issues.apache.org/jira/browse/YARN-3908?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14643421#comment-14643421 ] Li Lu commented on YARN-3908: - Just a quick check about the current status of this JIRA. Are we still planning to merge it in ASAP, or we want to fix the row key of timeline events with one more draft, or we plan to fully resolve timeline event problems before we merge it in (if fixing the row key does not fully resolve the problem)? I'd like to know our plan on this JIRA so that I can fine tune my patch for YARN-3904 accordingly. Thanks! > Bugs in HBaseTimelineWriterImpl > --- > > Key: YARN-3908 > URL: https://issues.apache.org/jira/browse/YARN-3908 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Reporter: Zhijie Shen >Assignee: Vrushali C > Attachments: YARN-3908-YARN-2928.001.patch, > YARN-3908-YARN-2928.002.patch, YARN-3908-YARN-2928.003.patch, > YARN-3908-YARN-2928.004.patch, YARN-3908-YARN-2928.004.patch, > YARN-3908-YARN-2928.005.patch > > > 1. In HBaseTimelineWriterImpl, the info column family contains the basic > fields of a timeline entity plus events. However, entity#info map is not > stored at all. > 2 event#timestamp is also not persisted. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3908) Bugs in HBaseTimelineWriterImpl
[ https://issues.apache.org/jira/browse/YARN-3908?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14643267#comment-14643267 ] Zhijie Shen commented on YARN-3908: --- Okay, it's fair point. It seems that the key design significantly depends on how we want to operate on the events. The current key design is most friendly to check if there exists the events who match the given event ID to match some given info key (and its value). But if you want to fetch everything that belongs to this event (our query needs to do this, as it's implicitly an atomic unit for now), it seems to be inevitable to scan through all these columns that have the given event ID (correct me if I'm wrong :-). If so, there seems to to have little gain from this key design, while complicating the event encapsulation logic. And after rethinking of the current query to support (YARN-3051), I want to amend my suggestion. It seems to be more reasonable to use {{e!eventTimestamp?eventId?eventInfoKey}}, such that we can natively scan through the events of one entity one-by-one return them in a chronological order. > Bugs in HBaseTimelineWriterImpl > --- > > Key: YARN-3908 > URL: https://issues.apache.org/jira/browse/YARN-3908 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Reporter: Zhijie Shen >Assignee: Vrushali C > Attachments: YARN-3908-YARN-2928.001.patch, > YARN-3908-YARN-2928.002.patch, YARN-3908-YARN-2928.003.patch, > YARN-3908-YARN-2928.004.patch, YARN-3908-YARN-2928.004.patch, > YARN-3908-YARN-2928.005.patch > > > 1. In HBaseTimelineWriterImpl, the info column family contains the basic > fields of a timeline entity plus events. However, entity#info map is not > stored at all. > 2 event#timestamp is also not persisted. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3908) Bugs in HBaseTimelineWriterImpl
[ https://issues.apache.org/jira/browse/YARN-3908?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14643188#comment-14643188 ] Vrushali C commented on YARN-3908: -- Hi Zhijie Thanks.. that is a good point. But if we put the event timestamp first, we have no way of querying for a particular event key unless we know the exact timestamp. I think knowing the exact time is probably almost impossible. Imagine that there is another event that occurs between the two kill events, so it has a timestamp > kill1 and < kill2. Now we still have to fetch all those and filter them out. So placing the timestamp first does not help in this case. But if we have the event key first, the columns will be placed together and the event timestamps will be stored in a chronological order (using the long.max - ts value). So the first one being fetched for kill event would be the latest for that event key. thanks Vrushali > Bugs in HBaseTimelineWriterImpl > --- > > Key: YARN-3908 > URL: https://issues.apache.org/jira/browse/YARN-3908 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Reporter: Zhijie Shen >Assignee: Vrushali C > Attachments: YARN-3908-YARN-2928.001.patch, > YARN-3908-YARN-2928.002.patch, YARN-3908-YARN-2928.003.patch, > YARN-3908-YARN-2928.004.patch, YARN-3908-YARN-2928.004.patch, > YARN-3908-YARN-2928.005.patch > > > 1. In HBaseTimelineWriterImpl, the info column family contains the basic > fields of a timeline entity plus events. However, entity#info map is not > stored at all. > 2 event#timestamp is also not persisted. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3908) Bugs in HBaseTimelineWriterImpl
[ https://issues.apache.org/jira/browse/YARN-3908?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14643113#comment-14643113 ] Zhijie Shen commented on YARN-3908: --- [~vrushalic], thanks for fixing the problem. W.R.T the column key, shall we use: {code} e!eventId?eventTimestamp?eventInfoKey : eventInfoValue {code} Image we have two KILL events: one on TS1 and the other on TS2. IMHO, we want to scan through the two events' columns one-by-one instead of in a interleaved manner. This will make reader to parse multiple events much easier and encapsulate them one after the other. It will be more useful in the future if we want to just retrieve part of the events of a big job (e.g. within a given time window or the most recent events). Thoughts? > Bugs in HBaseTimelineWriterImpl > --- > > Key: YARN-3908 > URL: https://issues.apache.org/jira/browse/YARN-3908 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Reporter: Zhijie Shen >Assignee: Vrushali C > Attachments: YARN-3908-YARN-2928.001.patch, > YARN-3908-YARN-2928.002.patch, YARN-3908-YARN-2928.003.patch, > YARN-3908-YARN-2928.004.patch, YARN-3908-YARN-2928.004.patch, > YARN-3908-YARN-2928.005.patch > > > 1. In HBaseTimelineWriterImpl, the info column family contains the basic > fields of a timeline entity plus events. However, entity#info map is not > stored at all. > 2 event#timestamp is also not persisted. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3908) Bugs in HBaseTimelineWriterImpl
[ https://issues.apache.org/jira/browse/YARN-3908?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14643048#comment-14643048 ] Li Lu commented on YARN-3908: - Hi [~sjlee0], I'm OK with checking this in and address the event schema problem in a separate JIRA. > Bugs in HBaseTimelineWriterImpl > --- > > Key: YARN-3908 > URL: https://issues.apache.org/jira/browse/YARN-3908 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Reporter: Zhijie Shen >Assignee: Vrushali C > Attachments: YARN-3908-YARN-2928.001.patch, > YARN-3908-YARN-2928.002.patch, YARN-3908-YARN-2928.003.patch, > YARN-3908-YARN-2928.004.patch, YARN-3908-YARN-2928.004.patch, > YARN-3908-YARN-2928.005.patch > > > 1. In HBaseTimelineWriterImpl, the info column family contains the basic > fields of a timeline entity plus events. However, entity#info map is not > stored at all. > 2 event#timestamp is also not persisted. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3908) Bugs in HBaseTimelineWriterImpl
[ https://issues.apache.org/jira/browse/YARN-3908?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14642860#comment-14642860 ] Sangjin Lee commented on YARN-3908: --- Thanks for the update [~vrushalic]. Are folks OK with this going in as is and making further changes to the event schema in a separate ticket? Let me know, and if everyone is fine, I'll merge this patch later today. > Bugs in HBaseTimelineWriterImpl > --- > > Key: YARN-3908 > URL: https://issues.apache.org/jira/browse/YARN-3908 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Reporter: Zhijie Shen >Assignee: Vrushali C > Attachments: YARN-3908-YARN-2928.001.patch, > YARN-3908-YARN-2928.002.patch, YARN-3908-YARN-2928.003.patch, > YARN-3908-YARN-2928.004.patch, YARN-3908-YARN-2928.004.patch, > YARN-3908-YARN-2928.005.patch > > > 1. In HBaseTimelineWriterImpl, the info column family contains the basic > fields of a timeline entity plus events. However, entity#info map is not > stored at all. > 2 event#timestamp is also not persisted. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3908) Bugs in HBaseTimelineWriterImpl
[ https://issues.apache.org/jira/browse/YARN-3908?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14641211#comment-14641211 ] Vrushali C commented on YARN-3908: -- Sounds good, the current patch can be checked in and I can continue working on it later.. > Bugs in HBaseTimelineWriterImpl > --- > > Key: YARN-3908 > URL: https://issues.apache.org/jira/browse/YARN-3908 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Reporter: Zhijie Shen >Assignee: Vrushali C > Attachments: YARN-3908-YARN-2928.001.patch, > YARN-3908-YARN-2928.002.patch, YARN-3908-YARN-2928.003.patch, > YARN-3908-YARN-2928.004.patch, YARN-3908-YARN-2928.004.patch, > YARN-3908-YARN-2928.005.patch > > > 1. In HBaseTimelineWriterImpl, the info column family contains the basic > fields of a timeline entity plus events. However, entity#info map is not > stored at all. > 2 event#timestamp is also not persisted. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3908) Bugs in HBaseTimelineWriterImpl
[ https://issues.apache.org/jira/browse/YARN-3908?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14641136#comment-14641136 ] Li Lu commented on YARN-3908: - bq. It would be good to get the patch for YARN-3908 checked in (and then continue further discussion if needed). I agree. After checking this in we can rebase all ongoing patches for the storage implementations. Specifically, I'd like to wait this patch to go in before I upload a patch in YARN-3904, which handles both online and offline storage implementations. > Bugs in HBaseTimelineWriterImpl > --- > > Key: YARN-3908 > URL: https://issues.apache.org/jira/browse/YARN-3908 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Reporter: Zhijie Shen >Assignee: Vrushali C > Attachments: YARN-3908-YARN-2928.001.patch, > YARN-3908-YARN-2928.002.patch, YARN-3908-YARN-2928.003.patch, > YARN-3908-YARN-2928.004.patch, YARN-3908-YARN-2928.004.patch, > YARN-3908-YARN-2928.005.patch > > > 1. In HBaseTimelineWriterImpl, the info column family contains the basic > fields of a timeline entity plus events. However, entity#info map is not > stored at all. > 2 event#timestamp is also not persisted. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3908) Bugs in HBaseTimelineWriterImpl
[ https://issues.apache.org/jira/browse/YARN-3908?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14641114#comment-14641114 ] Joep Rottinghuis commented on YARN-3908: It would be good to get the patch for YARN-3908 checked in (and then continue further discussion if needed). Otherwise we'll end up having more and more patches in flight that all end up modifying the same code and that will require merges. > Bugs in HBaseTimelineWriterImpl > --- > > Key: YARN-3908 > URL: https://issues.apache.org/jira/browse/YARN-3908 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Reporter: Zhijie Shen >Assignee: Vrushali C > Attachments: YARN-3908-YARN-2928.001.patch, > YARN-3908-YARN-2928.002.patch, YARN-3908-YARN-2928.003.patch, > YARN-3908-YARN-2928.004.patch, YARN-3908-YARN-2928.004.patch, > YARN-3908-YARN-2928.005.patch > > > 1. In HBaseTimelineWriterImpl, the info column family contains the basic > fields of a timeline entity plus events. However, entity#info map is not > stored at all. > 2 event#timestamp is also not persisted. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3908) Bugs in HBaseTimelineWriterImpl
[ https://issues.apache.org/jira/browse/YARN-3908?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14641026#comment-14641026 ] Vrushali C commented on YARN-3908: -- Thanks everyone for the discussion. I will upload another patch on this soon. Sangjin, Joep and I also had some more offline discussions on this over the last few days. We considered two options: 1. store the event timestamp as the hbase cell timestamp. 2. store the event timestamp as part of the column key. In the first approach, it is easier to query for time range queries, for example, give me the events that occurred in this time range. The column names look cleaner too. The downside of the first approach is that, we need to setup the column family info to keep multiple versions and ensure other columns than the event columns don't store multiple versions, which is not a very clean way to store it. Yet another option is to store event information in the metrics family but that does not actually belong in that column family, so we are mixing things, which will make it harder while aggregating metrics. So based on these points, we plan to go with approach #2 : storing the event timestamp as part of the column key. I will be making some changes to this patch accordingly. The event information will be stored in the info column family. The timestamps will be part of the column name. So it will be stored as: {code} e!eventId?eventInfoKey?eventTimestamp : eventInfoValue {code} For reader: There is a {code} org.apache.hadoop.hbase.filter.ColumnPrefixFilter {code} which can be used to scan specific column keys. Wrt to chronological ordering, there needs to be some filtering in the reader code to pick the event info key-values that belong to the latest timestamp. For example, in the eg given by [~zjshen] above: b.q. i think proper logic is: if we put and , we should have two separate records persisted; and if we put and again, we should update the same record and let k1=v1'. Yes, this will be stored as you describe. But, for reading, we will get back all values that belong to all event timestamps since they will be part of the column key , so now reader needs to know which ones to return. > Bugs in HBaseTimelineWriterImpl > --- > > Key: YARN-3908 > URL: https://issues.apache.org/jira/browse/YARN-3908 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Reporter: Zhijie Shen >Assignee: Vrushali C > Attachments: YARN-3908-YARN-2928.001.patch, > YARN-3908-YARN-2928.002.patch, YARN-3908-YARN-2928.003.patch, > YARN-3908-YARN-2928.004.patch, YARN-3908-YARN-2928.004.patch, > YARN-3908-YARN-2928.005.patch > > > 1. In HBaseTimelineWriterImpl, the info column family contains the basic > fields of a timeline entity plus events. However, entity#info map is not > stored at all. > 2 event#timestamp is also not persisted. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3908) Bugs in HBaseTimelineWriterImpl
[ https://issues.apache.org/jira/browse/YARN-3908?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14637300#comment-14637300 ] Zhijie Shen commented on YARN-3908: --- bq. Is it the event id + timestamp? How about the event type? If you look at the equals() and the hashCode() implementations of TimelineEvent, it uses the timestamp, the event type, and even the info as a whole, but the id is not used for equality. How does that square with the stated intent that the event id and the timestamp form the identity? There's no event type now. In v1, it's called type, but in v2 is renamed to id. We want to use id + ts to identify an event object uniquely to support the case that an event happens multiple times. And we can avoid the combination ID like "container_allocation_13421543243". Does this make sense? bq. Is pretty much the only access pattern "give me all the events that belong to this entity"? Yeah, get the events in chronological order of one entity, or just getting part of them via filtering. bq. Two TimelineEvents are equal only if the timestamp is equal AND the type is equal AND the entire info maps are equal. What would we query by event type, timestamp and event info key? Do users always have to specify the timestamp? There's no type, but only ID. In the current reader API, we cannot do sub-entity filtering, but in the future, we can try to support , for example, getting the events in a given time window. If two event has the same , but different info, we may consider them as the same event, but carry different information. The latter put one will append more k/v pairs or update the existing ones. bq. Do we need to store only the latest event for each timestamp, or all of them? It would almost sound like the key should be type and timestamp, but what about the entire event info map? In DB, i think proper logic is: if we put and , we should have two separate records persisted; and if we put and again, we should update the same record and let k1=v1'. > Bugs in HBaseTimelineWriterImpl > --- > > Key: YARN-3908 > URL: https://issues.apache.org/jira/browse/YARN-3908 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Reporter: Zhijie Shen >Assignee: Vrushali C > Attachments: YARN-3908-YARN-2928.001.patch, > YARN-3908-YARN-2928.002.patch, YARN-3908-YARN-2928.003.patch, > YARN-3908-YARN-2928.004.patch, YARN-3908-YARN-2928.004.patch, > YARN-3908-YARN-2928.005.patch > > > 1. In HBaseTimelineWriterImpl, the info column family contains the basic > fields of a timeline entity plus events. However, entity#info map is not > stored at all. > 2 event#timestamp is also not persisted. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3908) Bugs in HBaseTimelineWriterImpl
[ https://issues.apache.org/jira/browse/YARN-3908?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14636399#comment-14636399 ] Joep Rottinghuis commented on YARN-3908: What does this mean for the HBaseTimelineWriterImpl? Two TimelineEvents are equal only if the timestamp is equal AND the type is equal AND the entire info maps are equal. What would we query by event type, timestamp and event info key? Do users always have to specify the timestamp? Do we need to store only the latest event for each timestamp, or all of them? The answer to all these questions are critical for the key design for the TimelineEvent. It would almost sound like the key should be type and timestamp, but what about the entire event info map? > Bugs in HBaseTimelineWriterImpl > --- > > Key: YARN-3908 > URL: https://issues.apache.org/jira/browse/YARN-3908 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Reporter: Zhijie Shen >Assignee: Vrushali C > Attachments: YARN-3908-YARN-2928.001.patch, > YARN-3908-YARN-2928.002.patch, YARN-3908-YARN-2928.003.patch, > YARN-3908-YARN-2928.004.patch, YARN-3908-YARN-2928.004.patch, > YARN-3908-YARN-2928.005.patch > > > 1. In HBaseTimelineWriterImpl, the info column family contains the basic > fields of a timeline entity plus events. However, entity#info map is not > stored at all. > 2 event#timestamp is also not persisted. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3908) Bugs in HBaseTimelineWriterImpl
[ https://issues.apache.org/jira/browse/YARN-3908?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14636058#comment-14636058 ] Li Lu commented on YARN-3908: - Hi [~sjlee0], I think the code you posted here belongs to timeline v1 (o.a.h.yarn.api.records.timeline.*), but the v2 version is in o.a.h.yarn.api.records.timelineservice.*. TimelineEvent in v2, modified in YARN-3836, does use id for all related tasks. We're no longer using event info for equality check in that version. > Bugs in HBaseTimelineWriterImpl > --- > > Key: YARN-3908 > URL: https://issues.apache.org/jira/browse/YARN-3908 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Reporter: Zhijie Shen >Assignee: Vrushali C > Attachments: YARN-3908-YARN-2928.001.patch, > YARN-3908-YARN-2928.002.patch, YARN-3908-YARN-2928.003.patch, > YARN-3908-YARN-2928.004.patch, YARN-3908-YARN-2928.004.patch, > YARN-3908-YARN-2928.005.patch > > > 1. In HBaseTimelineWriterImpl, the info column family contains the basic > fields of a timeline entity plus events. However, entity#info map is not > stored at all. > 2 event#timestamp is also not persisted. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3908) Bugs in HBaseTimelineWriterImpl
[ https://issues.apache.org/jira/browse/YARN-3908?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14636057#comment-14636057 ] Sangjin Lee commented on YARN-3908: --- Sorry my bad. I mistakenly pulled up the v.1 of {{TimelineEvent}}. Our version uses only the id and the timestamp for equality: {code:title=TimelineEvent.java|borderStyle=solid} @Override public int hashCode() { int result = (int) (timestamp ^ (timestamp >>> 32)); result = 31 * result + id.hashCode(); return result; } @Override public boolean equals(Object o) { if (this == o) return true; if (!(o instanceof TimelineEvent)) return false; TimelineEvent event = (TimelineEvent) o; if (timestamp != event.timestamp) return false; if (!id.equals(event.id)) { return false; } return true; } @Override public int compareTo(TimelineEvent other) { if (timestamp > other.timestamp) { return -1; } else if (timestamp < other.timestamp) { return 1; } else { return id.compareTo(other.id); } } {code} So that answers my first question. Sorry for the confusion! Only the second question remains... > Bugs in HBaseTimelineWriterImpl > --- > > Key: YARN-3908 > URL: https://issues.apache.org/jira/browse/YARN-3908 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Reporter: Zhijie Shen >Assignee: Vrushali C > Attachments: YARN-3908-YARN-2928.001.patch, > YARN-3908-YARN-2928.002.patch, YARN-3908-YARN-2928.003.patch, > YARN-3908-YARN-2928.004.patch, YARN-3908-YARN-2928.004.patch, > YARN-3908-YARN-2928.005.patch > > > 1. In HBaseTimelineWriterImpl, the info column family contains the basic > fields of a timeline entity plus events. However, entity#info map is not > stored at all. > 2 event#timestamp is also not persisted. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3908) Bugs in HBaseTimelineWriterImpl
[ https://issues.apache.org/jira/browse/YARN-3908?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14636050#comment-14636050 ] Sangjin Lee commented on YARN-3908: --- Hmm, then isn't this incorrect? {code:title=TimelineEvent.java|borderStyle=solid} @Override public int compareTo(TimelineEvent other) { if (timestamp > other.timestamp) { return -1; } else if (timestamp < other.timestamp) { return 1; } else { return eventType.compareTo(other.eventType); } } @Override public boolean equals(Object o) { if (this == o) return true; if (o == null || getClass() != o.getClass()) return false; TimelineEvent event = (TimelineEvent) o; if (timestamp != event.timestamp) return false; if (!eventType.equals(event.eventType)) return false; if (eventInfo != null ? !eventInfo.equals(event.eventInfo) : event.eventInfo != null) return false; return true; } @Override public int hashCode() { int result = (int) (timestamp ^ (timestamp >>> 32)); result = 31 * result + eventType.hashCode(); result = 31 * result + (eventInfo != null ? eventInfo.hashCode() : 0); return result; } {code} First of all, id is not even used. Instead type is used. Also, event info is part of the equality semantics. > Bugs in HBaseTimelineWriterImpl > --- > > Key: YARN-3908 > URL: https://issues.apache.org/jira/browse/YARN-3908 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Reporter: Zhijie Shen >Assignee: Vrushali C > Attachments: YARN-3908-YARN-2928.001.patch, > YARN-3908-YARN-2928.002.patch, YARN-3908-YARN-2928.003.patch, > YARN-3908-YARN-2928.004.patch, YARN-3908-YARN-2928.004.patch, > YARN-3908-YARN-2928.005.patch > > > 1. In HBaseTimelineWriterImpl, the info column family contains the basic > fields of a timeline entity plus events. However, entity#info map is not > stored at all. > 2 event#timestamp is also not persisted. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3908) Bugs in HBaseTimelineWriterImpl
[ https://issues.apache.org/jira/browse/YARN-3908?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14636009#comment-14636009 ] Li Lu commented on YARN-3908: - Hi [~sjlee0], I don't think we're still using event type in new TimelineEvent v2. However, the behavior you mentioned is quite consistent with the v1 TimelineEvent. Could you please double check this? Thanks! > Bugs in HBaseTimelineWriterImpl > --- > > Key: YARN-3908 > URL: https://issues.apache.org/jira/browse/YARN-3908 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Reporter: Zhijie Shen >Assignee: Vrushali C > Attachments: YARN-3908-YARN-2928.001.patch, > YARN-3908-YARN-2928.002.patch, YARN-3908-YARN-2928.003.patch, > YARN-3908-YARN-2928.004.patch, YARN-3908-YARN-2928.004.patch, > YARN-3908-YARN-2928.005.patch > > > 1. In HBaseTimelineWriterImpl, the info column family contains the basic > fields of a timeline entity plus events. However, entity#info map is not > stored at all. > 2 event#timestamp is also not persisted. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3908) Bugs in HBaseTimelineWriterImpl
[ https://issues.apache.org/jira/browse/YARN-3908?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14635996#comment-14635996 ] Sangjin Lee commented on YARN-3908: --- [~jrottinghuis], [~vrushalic], and I had offline chats, and we feel that we may need to revisit how we store events. Currently (with this patch) we store the event with the column name "e!eventId?infoKey" and the column value being the info value. The event timestamp is stored as the cell timestamp. We're realizing that this may not be a correct way to store events. I'm basing this on the [discussion|https://issues.apache.org/jira/browse/YARN-3836?focusedCommentId=14619729&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14619729] we had when we talked about the equality and identity semantics of {{TimelineEvent}}. Namely, the id *and* the timestamp form the identity of a {{TimelineEvent}}. Then I think storing the timestamp in the HBase cell timestamp does not work. Some questions for you, [~zjshen] and [~gtCarrera9]. (1) *What defines the identity of a {{TimelineEvent}}?* Is it the event id + timestamp? How about the event type? If you look at the {{equals()}} and the {{hashCode()}} implementations of {{TimelineEvent}}, it uses the timestamp, the event type, and even the info as a whole, but the id is not used for equality. How does that square with the stated intent that the event id and the timestamp form the identity? (2) *What would be the access pattern* for {{TimelineEvents}}?* Is pretty much the only access pattern "give me all the events that belong to this entity"? Also specifically, would you ever query for an event with the id *and* the timestamp? It is not reasonable for readers to be able to provide the event timestamp for queries, right? Would you also query for just the event id? What other access patterns need to be supported? Clarifying those things would help us correctly implement the schema. Thanks! > Bugs in HBaseTimelineWriterImpl > --- > > Key: YARN-3908 > URL: https://issues.apache.org/jira/browse/YARN-3908 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Reporter: Zhijie Shen >Assignee: Vrushali C > Attachments: YARN-3908-YARN-2928.001.patch, > YARN-3908-YARN-2928.002.patch, YARN-3908-YARN-2928.003.patch, > YARN-3908-YARN-2928.004.patch, YARN-3908-YARN-2928.004.patch, > YARN-3908-YARN-2928.005.patch > > > 1. In HBaseTimelineWriterImpl, the info column family contains the basic > fields of a timeline entity plus events. However, entity#info map is not > stored at all. > 2 event#timestamp is also not persisted. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3908) Bugs in HBaseTimelineWriterImpl
[ https://issues.apache.org/jira/browse/YARN-3908?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14635754#comment-14635754 ] Sangjin Lee commented on YARN-3908: --- I filed YARN-3949 to address the need for timely flush of writes. > Bugs in HBaseTimelineWriterImpl > --- > > Key: YARN-3908 > URL: https://issues.apache.org/jira/browse/YARN-3908 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Reporter: Zhijie Shen >Assignee: Vrushali C > Attachments: YARN-3908-YARN-2928.001.patch, > YARN-3908-YARN-2928.002.patch, YARN-3908-YARN-2928.003.patch, > YARN-3908-YARN-2928.004.patch, YARN-3908-YARN-2928.004.patch, > YARN-3908-YARN-2928.005.patch > > > 1. In HBaseTimelineWriterImpl, the info column family contains the basic > fields of a timeline entity plus events. However, entity#info map is not > stored at all. > 2 event#timestamp is also not persisted. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3908) Bugs in HBaseTimelineWriterImpl
[ https://issues.apache.org/jira/browse/YARN-3908?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14634383#comment-14634383 ] Sangjin Lee commented on YARN-3908: --- I'd like to propose closing this issue with the current patch (provided everyone is happy with the patch of course) with the understanding that the patch fixes all the issues mentioned here with the exception of the flush behavior. I do think handling the flush might be more far reaching than the changes that are being done here, so I'm inclined to create a separate ticket for it. Let me know what you think. > Bugs in HBaseTimelineWriterImpl > --- > > Key: YARN-3908 > URL: https://issues.apache.org/jira/browse/YARN-3908 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Reporter: Zhijie Shen >Assignee: Vrushali C > Attachments: YARN-3908-YARN-2928.001.patch, > YARN-3908-YARN-2928.002.patch, YARN-3908-YARN-2928.003.patch, > YARN-3908-YARN-2928.004.patch, YARN-3908-YARN-2928.004.patch, > YARN-3908-YARN-2928.005.patch > > > 1. In HBaseTimelineWriterImpl, the info column family contains the basic > fields of a timeline entity plus events. However, entity#info map is not > stored at all. > 2 event#timestamp is also not persisted. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3908) Bugs in HBaseTimelineWriterImpl
[ https://issues.apache.org/jira/browse/YARN-3908?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14634293#comment-14634293 ] Sangjin Lee commented on YARN-3908: --- Currently we're not calling {{BufferedMutator.flush()}} explicitly. It sounds like that might be a good thing to do. [~jrottinghuis], [~vrushalic], thoughts on calling {{BufferedMutator.flush()}} at the end of the {{write()}} call? Did you consider doing that when you created the hbase writer? > Bugs in HBaseTimelineWriterImpl > --- > > Key: YARN-3908 > URL: https://issues.apache.org/jira/browse/YARN-3908 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Reporter: Zhijie Shen >Assignee: Vrushali C > Attachments: YARN-3908-YARN-2928.001.patch, > YARN-3908-YARN-2928.002.patch, YARN-3908-YARN-2928.003.patch, > YARN-3908-YARN-2928.004.patch, YARN-3908-YARN-2928.004.patch, > YARN-3908-YARN-2928.005.patch > > > 1. In HBaseTimelineWriterImpl, the info column family contains the basic > fields of a timeline entity plus events. However, entity#info map is not > stored at all. > 2 event#timestamp is also not persisted. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3908) Bugs in HBaseTimelineWriterImpl
[ https://issues.apache.org/jira/browse/YARN-3908?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14634200#comment-14634200 ] Zhijie Shen commented on YARN-3908: --- I set up hbase-1.0.1.1 as a single node cluster on local FS, submit an MR job, after job got finished, I used the REST API (YARN-3049) to read the entity -> NOT FOUND and I used hbase shell to scan through the entity table -> NOT FOUND as well. We may want to rethink of the buffer policy. It seems not to be a good user experience that after app is finished, the entity is still not available to users. > Bugs in HBaseTimelineWriterImpl > --- > > Key: YARN-3908 > URL: https://issues.apache.org/jira/browse/YARN-3908 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Reporter: Zhijie Shen >Assignee: Vrushali C > Attachments: YARN-3908-YARN-2928.001.patch, > YARN-3908-YARN-2928.002.patch, YARN-3908-YARN-2928.003.patch, > YARN-3908-YARN-2928.004.patch, YARN-3908-YARN-2928.004.patch, > YARN-3908-YARN-2928.005.patch > > > 1. In HBaseTimelineWriterImpl, the info column family contains the basic > fields of a timeline entity plus events. However, entity#info map is not > stored at all. > 2 event#timestamp is also not persisted. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3908) Bugs in HBaseTimelineWriterImpl
[ https://issues.apache.org/jira/browse/YARN-3908?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14633937#comment-14633937 ] Sangjin Lee commented on YARN-3908: --- bq. 1. The events have been written into metrics column family. Correct, that was another bug, and it's been corrected with the patch. Have you tried the patch to see if the problem has been fixed? bq. 2. The entity is not accessible immediately after a single put operation. Could you kindly elaborate bit more on under what condition this happens? IIUC, the current mechanism is using the buffered mutator (http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/BufferedMutator.html), and the writes are flushed to the HBase table in a batch asynchronous manner. Perhaps you're encountering that behavior? > Bugs in HBaseTimelineWriterImpl > --- > > Key: YARN-3908 > URL: https://issues.apache.org/jira/browse/YARN-3908 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Reporter: Zhijie Shen >Assignee: Vrushali C > Attachments: YARN-3908-YARN-2928.001.patch, > YARN-3908-YARN-2928.002.patch, YARN-3908-YARN-2928.003.patch, > YARN-3908-YARN-2928.004.patch, YARN-3908-YARN-2928.004.patch, > YARN-3908-YARN-2928.005.patch > > > 1. In HBaseTimelineWriterImpl, the info column family contains the basic > fields of a timeline entity plus events. However, entity#info map is not > stored at all. > 2 event#timestamp is also not persisted. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3908) Bugs in HBaseTimelineWriterImpl
[ https://issues.apache.org/jira/browse/YARN-3908?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14633936#comment-14633936 ] Sangjin Lee commented on YARN-3908: --- The JIRA site seems offline right now. bq. 1. The events have been written into metrics column family. Correct, that was another bug, and it's been corrected with the patch. Have you tried the patch to see if the problem has been fixed? bq. 2. The entity is not accessible immediately after a single put operation. Could you kindly elaborate bit more on under what condition this happens? IIUC, the current mechanism is using the buffered mutator ( http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/BufferedMutator.html), and the writes are flushed to the HBase table in a batch asynchronous manner. Perhaps you're encountering that behavior? On Mon, Jul 20, 2015 at 10:21 AM, Zhijie Shen (JIRA) > Bugs in HBaseTimelineWriterImpl > --- > > Key: YARN-3908 > URL: https://issues.apache.org/jira/browse/YARN-3908 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Reporter: Zhijie Shen >Assignee: Vrushali C > Attachments: YARN-3908-YARN-2928.001.patch, > YARN-3908-YARN-2928.002.patch, YARN-3908-YARN-2928.003.patch, > YARN-3908-YARN-2928.004.patch, YARN-3908-YARN-2928.004.patch, > YARN-3908-YARN-2928.005.patch > > > 1. In HBaseTimelineWriterImpl, the info column family contains the basic > fields of a timeline entity plus events. However, entity#info map is not > stored at all. > 2 event#timestamp is also not persisted. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3908) Bugs in HBaseTimelineWriterImpl
[ https://issues.apache.org/jira/browse/YARN-3908?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14633857#comment-14633857 ] Zhijie Shen commented on YARN-3908: --- I found two more issues upon debugging the reader POC: 1. The events have been written into metrics column family. 2. The entity is not accessible immediately after a single put operation. > Bugs in HBaseTimelineWriterImpl > --- > > Key: YARN-3908 > URL: https://issues.apache.org/jira/browse/YARN-3908 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Reporter: Zhijie Shen >Assignee: Vrushali C > Attachments: YARN-3908-YARN-2928.001.patch, > YARN-3908-YARN-2928.002.patch, YARN-3908-YARN-2928.003.patch, > YARN-3908-YARN-2928.004.patch, YARN-3908-YARN-2928.004.patch, > YARN-3908-YARN-2928.005.patch > > > 1. In HBaseTimelineWriterImpl, the info column family contains the basic > fields of a timeline entity plus events. However, entity#info map is not > stored at all. > 2 event#timestamp is also not persisted. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3908) Bugs in HBaseTimelineWriterImpl
[ https://issues.apache.org/jira/browse/YARN-3908?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14632094#comment-14632094 ] Hadoop QA commented on YARN-3908: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:red}-1{color} | pre-patch | 17m 5s | Findbugs (version ) appears to be broken on YARN-2928. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 1 new or modified test files. | | {color:green}+1{color} | javac | 8m 3s | There were no new javac warning messages. | | {color:green}+1{color} | javadoc | 9m 55s | There were no new javadoc warning messages. | | {color:green}+1{color} | release audit | 0m 22s | The applied patch does not increase the total number of release audit warnings. | | {color:green}+1{color} | checkstyle | 1m 18s | There were no new checkstyle issues. | | {color:green}+1{color} | whitespace | 0m 1s | The patch has no lines that end in whitespace. | | {color:green}+1{color} | install | 1m 28s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 40s | The patch built with eclipse:eclipse. | | {color:green}+1{color} | findbugs | 2m 21s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. | | {color:green}+1{color} | yarn tests | 0m 25s | Tests passed in hadoop-yarn-api. | | {color:green}+1{color} | yarn tests | 1m 22s | Tests passed in hadoop-yarn-server-timelineservice. | | | | 43m 4s | | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12745903/YARN-3908-YARN-2928.005.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | YARN-2928 / eb1932d | | hadoop-yarn-api test log | https://builds.apache.org/job/PreCommit-YARN-Build/8578/artifact/patchprocess/testrun_hadoop-yarn-api.txt | | hadoop-yarn-server-timelineservice test log | https://builds.apache.org/job/PreCommit-YARN-Build/8578/artifact/patchprocess/testrun_hadoop-yarn-server-timelineservice.txt | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/8578/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf905.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/8578/console | This message was automatically generated. > Bugs in HBaseTimelineWriterImpl > --- > > Key: YARN-3908 > URL: https://issues.apache.org/jira/browse/YARN-3908 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Reporter: Zhijie Shen >Assignee: Vrushali C > Attachments: YARN-3908-YARN-2928.001.patch, > YARN-3908-YARN-2928.002.patch, YARN-3908-YARN-2928.003.patch, > YARN-3908-YARN-2928.004.patch, YARN-3908-YARN-2928.004.patch, > YARN-3908-YARN-2928.005.patch > > > 1. In HBaseTimelineWriterImpl, the info column family contains the basic > fields of a timeline entity plus events. However, entity#info map is not > stored at all. > 2 event#timestamp is also not persisted. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3908) Bugs in HBaseTimelineWriterImpl
[ https://issues.apache.org/jira/browse/YARN-3908?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14631826#comment-14631826 ] Sangjin Lee commented on YARN-3908: --- Thanks for the comment [~gtCarrera9]. I agree the name is bit awkward. Let me see if I can rename it to something more appropriate. Will update. > Bugs in HBaseTimelineWriterImpl > --- > > Key: YARN-3908 > URL: https://issues.apache.org/jira/browse/YARN-3908 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Reporter: Zhijie Shen >Assignee: Vrushali C > Attachments: YARN-3908-YARN-2928.001.patch, > YARN-3908-YARN-2928.002.patch, YARN-3908-YARN-2928.003.patch, > YARN-3908-YARN-2928.004.patch, YARN-3908-YARN-2928.004.patch > > > 1. In HBaseTimelineWriterImpl, the info column family contains the basic > fields of a timeline entity plus events. However, entity#info map is not > stored at all. > 2 event#timestamp is also not persisted. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3908) Bugs in HBaseTimelineWriterImpl
[ https://issues.apache.org/jira/browse/YARN-3908?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14630769#comment-14630769 ] Li Lu commented on YARN-3908: - Hi [~sjlee0], one nit: the name {{readTimeseriesResults}} looks a little bit confusing if used with timeline events ({{ EntityColumnPrefix.EVENT.readTimeseriesResults(result)}}). It looks to be related to the time series data within a timeline metric. Maybe we can change that into a more general name to accommodate the generic type V (or T)? > Bugs in HBaseTimelineWriterImpl > --- > > Key: YARN-3908 > URL: https://issues.apache.org/jira/browse/YARN-3908 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Reporter: Zhijie Shen >Assignee: Vrushali C > Attachments: YARN-3908-YARN-2928.001.patch, > YARN-3908-YARN-2928.002.patch, YARN-3908-YARN-2928.003.patch, > YARN-3908-YARN-2928.004.patch, YARN-3908-YARN-2928.004.patch > > > 1. In HBaseTimelineWriterImpl, the info column family contains the basic > fields of a timeline entity plus events. However, entity#info map is not > stored at all. > 2 event#timestamp is also not persisted. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3908) Bugs in HBaseTimelineWriterImpl
[ https://issues.apache.org/jira/browse/YARN-3908?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14630479#comment-14630479 ] Hadoop QA commented on YARN-3908: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:red}-1{color} | pre-patch | 18m 5s | Findbugs (version ) appears to be broken on YARN-2928. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 1 new or modified test files. | | {color:green}+1{color} | javac | 8m 23s | There were no new javac warning messages. | | {color:green}+1{color} | javadoc | 10m 23s | There were no new javadoc warning messages. | | {color:green}+1{color} | release audit | 0m 24s | The applied patch does not increase the total number of release audit warnings. | | {color:green}+1{color} | checkstyle | 1m 22s | There were no new checkstyle issues. | | {color:green}+1{color} | whitespace | 0m 1s | The patch has no lines that end in whitespace. | | {color:green}+1{color} | install | 1m 28s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 41s | The patch built with eclipse:eclipse. | | {color:green}+1{color} | findbugs | 2m 24s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. | | {color:green}+1{color} | yarn tests | 0m 26s | Tests passed in hadoop-yarn-api. | | {color:green}+1{color} | yarn tests | 1m 21s | Tests passed in hadoop-yarn-server-timelineservice. | | | | 45m 2s | | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12745704/YARN-3908-YARN-2928.004.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | YARN-2928 / eb1932d | | hadoop-yarn-api test log | https://builds.apache.org/job/PreCommit-YARN-Build/8563/artifact/patchprocess/testrun_hadoop-yarn-api.txt | | hadoop-yarn-server-timelineservice test log | https://builds.apache.org/job/PreCommit-YARN-Build/8563/artifact/patchprocess/testrun_hadoop-yarn-server-timelineservice.txt | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/8563/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf904.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/8563/console | This message was automatically generated. > Bugs in HBaseTimelineWriterImpl > --- > > Key: YARN-3908 > URL: https://issues.apache.org/jira/browse/YARN-3908 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Reporter: Zhijie Shen >Assignee: Vrushali C > Attachments: YARN-3908-YARN-2928.001.patch, > YARN-3908-YARN-2928.002.patch, YARN-3908-YARN-2928.003.patch, > YARN-3908-YARN-2928.004.patch, YARN-3908-YARN-2928.004.patch > > > 1. In HBaseTimelineWriterImpl, the info column family contains the basic > fields of a timeline entity plus events. However, entity#info map is not > stored at all. > 2 event#timestamp is also not persisted. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3908) Bugs in HBaseTimelineWriterImpl
[ https://issues.apache.org/jira/browse/YARN-3908?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14630289#comment-14630289 ] Joep Rottinghuis commented on YARN-3908: Patch looks good with one comment. I completely overlooked the event info map, because it isn't part of the javadoc on the EntityTable. I should have double-checked but didn't. Thanks for catching this. [~sjlee0] I think it would be good to update the javadoc that describes the EntityTable in the EntityTable.java file. The same is probably missing from the doc "Timeline service schema for native HBase tables" (not sure which jira the PDF for that doc is attached to), because I copied it from the code. I don't think that the application table has been copied yet, so it won't be missing from there. > Bugs in HBaseTimelineWriterImpl > --- > > Key: YARN-3908 > URL: https://issues.apache.org/jira/browse/YARN-3908 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Reporter: Zhijie Shen >Assignee: Vrushali C > Attachments: YARN-3908-YARN-2928.001.patch, > YARN-3908-YARN-2928.002.patch, YARN-3908-YARN-2928.003.patch > > > 1. In HBaseTimelineWriterImpl, the info column family contains the basic > fields of a timeline entity plus events. However, entity#info map is not > stored at all. > 2 event#timestamp is also not persisted. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3908) Bugs in HBaseTimelineWriterImpl
[ https://issues.apache.org/jira/browse/YARN-3908?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14630275#comment-14630275 ] Joep Rottinghuis commented on YARN-3908: bq. In fact, I'm wondering if we should but info and events into a separate column family like what we did for configs/metrics? In principle we should keep everything in the same column family (fewer store files) unless: a) The items that we store require a different TTL, compression, etc. This is the case for metrics where we need a separate TTL. b) The columns are rather significant in size, and in many queries they'll be skipped (and specifically not used in push-down predicate ie. column value filters etc). This is the case for configuration. If we have many queries to just retrieve info fields and we skip configs in these, then iterating over just the rows in the info column family will have a benefit of not needing to access the config store files. Otherwise a separate column family just results in more store files and doesn't really gain us anything. Given the current code setup, switching column family is almost trivial, so given that there are no functionality differences, I'd say let's not even try to further optimize this until we have way more code in place. Then we can run large batches of historical job history files and other inputs (perhaps porting data from ATS v1) and then we can see the potential benefit or downside. The other reason to not do premature optimization is that I'm still thinking of adding a few more perf tweaks. Those would also just be performance optimizations, and not any functionality different, so also not a priority now. We should look at tuning all those things much later and together in a coherent way. Additional settings that we need to test are RPC compression, encoding of the store files and/or compression of the same. In short, let's focus on completing functionality and then tinker with these settings later. > Bugs in HBaseTimelineWriterImpl > --- > > Key: YARN-3908 > URL: https://issues.apache.org/jira/browse/YARN-3908 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Reporter: Zhijie Shen >Assignee: Vrushali C > Attachments: YARN-3908-YARN-2928.001.patch, > YARN-3908-YARN-2928.002.patch, YARN-3908-YARN-2928.003.patch > > > 1. In HBaseTimelineWriterImpl, the info column family contains the basic > fields of a timeline entity plus events. However, entity#info map is not > stored at all. > 2 event#timestamp is also not persisted. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3908) Bugs in HBaseTimelineWriterImpl
[ https://issues.apache.org/jira/browse/YARN-3908?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14625502#comment-14625502 ] Sangjin Lee commented on YARN-3908: --- bq. 1. In fact, I'm wondering if we should but info and events into a separate column family like what we did for configs/metrics? [~vrushalic], [~jrottinghuis], your thoughts on this? > Bugs in HBaseTimelineWriterImpl > --- > > Key: YARN-3908 > URL: https://issues.apache.org/jira/browse/YARN-3908 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Reporter: Zhijie Shen >Assignee: Vrushali C > Attachments: YARN-3908-YARN-2928.001.patch, > YARN-3908-YARN-2928.002.patch, YARN-3908-YARN-2928.003.patch > > > 1. In HBaseTimelineWriterImpl, the info column family contains the basic > fields of a timeline entity plus events. However, entity#info map is not > stored at all. > 2 event#timestamp is also not persisted. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3908) Bugs in HBaseTimelineWriterImpl
[ https://issues.apache.org/jira/browse/YARN-3908?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14625414#comment-14625414 ] Sangjin Lee commented on YARN-3908: --- When a time series data expires after the TTL (except for the latest value), it will only contain a single value. For all practical purposes, the metric at that point would act like a single value. We thought that it would be fine. Do you see a situation where (probably on the read path) we need to recognize some metric as a time series and do something different *even though* there is only one value in the column? > Bugs in HBaseTimelineWriterImpl > --- > > Key: YARN-3908 > URL: https://issues.apache.org/jira/browse/YARN-3908 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Reporter: Zhijie Shen >Assignee: Vrushali C > Attachments: YARN-3908-YARN-2928.001.patch, > YARN-3908-YARN-2928.002.patch, YARN-3908-YARN-2928.003.patch > > > 1. In HBaseTimelineWriterImpl, the info column family contains the basic > fields of a timeline entity plus events. However, entity#info map is not > stored at all. > 2 event#timestamp is also not persisted. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3908) Bugs in HBaseTimelineWriterImpl
[ https://issues.apache.org/jira/browse/YARN-3908?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14625397#comment-14625397 ] Zhijie Shen commented on YARN-3908: --- Yeah, but the method based on metric value number is not guaranteed, are we okay with it? > Bugs in HBaseTimelineWriterImpl > --- > > Key: YARN-3908 > URL: https://issues.apache.org/jira/browse/YARN-3908 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Reporter: Zhijie Shen >Assignee: Vrushali C > Attachments: YARN-3908-YARN-2928.001.patch, > YARN-3908-YARN-2928.002.patch, YARN-3908-YARN-2928.003.patch > > > 1. In HBaseTimelineWriterImpl, the info column family contains the basic > fields of a timeline entity plus events. However, entity#info map is not > stored at all. > 2 event#timestamp is also not persisted. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3908) Bugs in HBaseTimelineWriterImpl
[ https://issues.apache.org/jira/browse/YARN-3908?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14625387#comment-14625387 ] Sangjin Lee commented on YARN-3908: --- bq. 2. We don't want to store the metric type, do we? Maybe I was mistaken when I read your comment that said "I also realized that the metric type is not persisted too." I took it to mean that you're suggesting that we persist it. I also had an offline chat with [~vrushalic], and she clarified that we probably do not need to persist the metric type and that we can distinguish between a single value metric vs. time series as you described. We still need to make some changes to ensure that the hbase writer sets the right min version when it writes a single-value metric (currently it's not being done). > Bugs in HBaseTimelineWriterImpl > --- > > Key: YARN-3908 > URL: https://issues.apache.org/jira/browse/YARN-3908 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Reporter: Zhijie Shen >Assignee: Vrushali C > Attachments: YARN-3908-YARN-2928.001.patch, > YARN-3908-YARN-2928.002.patch, YARN-3908-YARN-2928.003.patch > > > 1. In HBaseTimelineWriterImpl, the info column family contains the basic > fields of a timeline entity plus events. However, entity#info map is not > stored at all. > 2 event#timestamp is also not persisted. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3908) Bugs in HBaseTimelineWriterImpl
[ https://issues.apache.org/jira/browse/YARN-3908?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14625369#comment-14625369 ] Zhijie Shen commented on YARN-3908: --- [~vrushalic] and [~sjlee0], thanks for helping fix the problems. I've two questions: 1. In fact, I'm wondering if we should but info and events into a separate column family like what we did for configs/metrics? 2. We don't want to store the metric type, do we? > Bugs in HBaseTimelineWriterImpl > --- > > Key: YARN-3908 > URL: https://issues.apache.org/jira/browse/YARN-3908 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Reporter: Zhijie Shen >Assignee: Vrushali C > Attachments: YARN-3908-YARN-2928.001.patch, > YARN-3908-YARN-2928.002.patch, YARN-3908-YARN-2928.003.patch > > > 1. In HBaseTimelineWriterImpl, the info column family contains the basic > fields of a timeline entity plus events. However, entity#info map is not > stored at all. > 2 event#timestamp is also not persisted. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3908) Bugs in HBaseTimelineWriterImpl
[ https://issues.apache.org/jira/browse/YARN-3908?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14624973#comment-14624973 ] Sangjin Lee commented on YARN-3908: --- I would appreciate your review on this. Thanks! > Bugs in HBaseTimelineWriterImpl > --- > > Key: YARN-3908 > URL: https://issues.apache.org/jira/browse/YARN-3908 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Reporter: Zhijie Shen >Assignee: Vrushali C > Attachments: YARN-3908-YARN-2928.001.patch, > YARN-3908-YARN-2928.002.patch, YARN-3908-YARN-2928.003.patch > > > 1. In HBaseTimelineWriterImpl, the info column family contains the basic > fields of a timeline entity plus events. However, entity#info map is not > stored at all. > 2 event#timestamp is also not persisted. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3908) Bugs in HBaseTimelineWriterImpl
[ https://issues.apache.org/jira/browse/YARN-3908?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14623180#comment-14623180 ] Hadoop QA commented on YARN-3908: - \\ \\ | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | pre-patch | 17m 3s | Pre-patch YARN-2928 compilation is healthy. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 1 new or modified test files. | | {color:green}+1{color} | javac | 7m 48s | There were no new javac warning messages. | | {color:green}+1{color} | javadoc | 9m 44s | There were no new javadoc warning messages. | | {color:green}+1{color} | release audit | 0m 23s | The applied patch does not increase the total number of release audit warnings. | | {color:green}+1{color} | checkstyle | 1m 23s | There were no new checkstyle issues. | | {color:green}+1{color} | whitespace | 0m 1s | The patch has no lines that end in whitespace. | | {color:green}+1{color} | install | 1m 37s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 40s | The patch built with eclipse:eclipse. | | {color:green}+1{color} | findbugs | 2m 23s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. | | {color:green}+1{color} | yarn tests | 0m 25s | Tests passed in hadoop-yarn-api. | | {color:green}+1{color} | yarn tests | 1m 21s | Tests passed in hadoop-yarn-server-timelineservice. | | | | 42m 51s | | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12744857/YARN-3908-YARN-2928.003.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | YARN-2928 / 2d4a8f4 | | hadoop-yarn-api test log | https://builds.apache.org/job/PreCommit-YARN-Build/8505/artifact/patchprocess/testrun_hadoop-yarn-api.txt | | hadoop-yarn-server-timelineservice test log | https://builds.apache.org/job/PreCommit-YARN-Build/8505/artifact/patchprocess/testrun_hadoop-yarn-server-timelineservice.txt | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/8505/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf905.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/8505/console | This message was automatically generated. > Bugs in HBaseTimelineWriterImpl > --- > > Key: YARN-3908 > URL: https://issues.apache.org/jira/browse/YARN-3908 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Reporter: Zhijie Shen >Assignee: Vrushali C > Attachments: YARN-3908-YARN-2928.001.patch, > YARN-3908-YARN-2928.002.patch, YARN-3908-YARN-2928.003.patch > > > 1. In HBaseTimelineWriterImpl, the info column family contains the basic > fields of a timeline entity plus events. However, entity#info map is not > stored at all. > 2 event#timestamp is also not persisted. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3908) Bugs in HBaseTimelineWriterImpl
[ https://issues.apache.org/jira/browse/YARN-3908?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14623133#comment-14623133 ] Hadoop QA commented on YARN-3908: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:red}-1{color} | pre-patch | 16m 52s | Findbugs (version ) appears to be broken on YARN-2928. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 1 new or modified test files. | | {color:red}-1{color} | javac | 7m 46s | The applied patch generated 1 additional warning messages. | | {color:green}+1{color} | javadoc | 9m 48s | There were no new javadoc warning messages. | | {color:green}+1{color} | release audit | 0m 23s | The applied patch does not increase the total number of release audit warnings. | | {color:green}+1{color} | checkstyle | 0m 58s | There were no new checkstyle issues. | | {color:green}+1{color} | whitespace | 0m 1s | The patch has no lines that end in whitespace. | | {color:green}+1{color} | install | 1m 39s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 41s | The patch built with eclipse:eclipse. | | {color:green}+1{color} | findbugs | 2m 20s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. | | {color:green}+1{color} | yarn tests | 0m 26s | Tests passed in hadoop-yarn-api. | | {color:green}+1{color} | yarn tests | 1m 22s | Tests passed in hadoop-yarn-server-timelineservice. | | | | 42m 19s | | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12744831/YARN-3908-YARN-2928.002.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | YARN-2928 / 2d4a8f4 | | javac | https://builds.apache.org/job/PreCommit-YARN-Build/8502/artifact/patchprocess/diffJavacWarnings.txt | | hadoop-yarn-api test log | https://builds.apache.org/job/PreCommit-YARN-Build/8502/artifact/patchprocess/testrun_hadoop-yarn-api.txt | | hadoop-yarn-server-timelineservice test log | https://builds.apache.org/job/PreCommit-YARN-Build/8502/artifact/patchprocess/testrun_hadoop-yarn-server-timelineservice.txt | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/8502/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf905.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/8502/console | This message was automatically generated. > Bugs in HBaseTimelineWriterImpl > --- > > Key: YARN-3908 > URL: https://issues.apache.org/jira/browse/YARN-3908 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Reporter: Zhijie Shen >Assignee: Vrushali C > Attachments: YARN-3908-YARN-2928.001.patch, > YARN-3908-YARN-2928.002.patch > > > 1. In HBaseTimelineWriterImpl, the info column family contains the basic > fields of a timeline entity plus events. However, entity#info map is not > stored at all. > 2 event#timestamp is also not persisted. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3908) Bugs in HBaseTimelineWriterImpl
[ https://issues.apache.org/jira/browse/YARN-3908?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14622987#comment-14622987 ] Sangjin Lee commented on YARN-3908: --- A couple of comments on the latest patch: I find that essentially the event timestamp is stored in the same manner as a metric timestamp. So the mechanism of retrieving the event timestamp is nearly identical to {{ColumnPrefix.readTimeseriesResults()}}. The only restriction of the previous version of that method is that it explicitly cast the values to {{Number}}. In the patch I genericized the method to handle any value type. Having said that, I recognize that reading an event using a method named {{readTimeseriesResults()}} is rather awkward. But alternatives are not great either. {{ColumnPrefix}} is an enum, and it is quite awkward to introduce a method that is useful for only one value of that enum. Perhaps we could create a static wrapper method to bridge that gap. Let me know what you think. Finally, I find that we're still not persisting the metric types. I could be wrong but it appears that all metrics are treated as time series when they are stored. I'll see if it would be straightforward to implement that piece, but it could be bit involved. How about capturing that in its own subtask? > Bugs in HBaseTimelineWriterImpl > --- > > Key: YARN-3908 > URL: https://issues.apache.org/jira/browse/YARN-3908 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Reporter: Zhijie Shen >Assignee: Vrushali C > Attachments: YARN-3908-YARN-2928.001.patch, > YARN-3908-YARN-2928.002.patch > > > 1. In HBaseTimelineWriterImpl, the info column family contains the basic > fields of a timeline entity plus events. However, entity#info map is not > stored at all. > 2 event#timestamp is also not persisted. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3908) Bugs in HBaseTimelineWriterImpl
[ https://issues.apache.org/jira/browse/YARN-3908?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14622727#comment-14622727 ] Sangjin Lee commented on YARN-3908: --- Thanks [~vrushalic]! I'll look at it and see if more changes are needed. Other reviews are welcome too. > Bugs in HBaseTimelineWriterImpl > --- > > Key: YARN-3908 > URL: https://issues.apache.org/jira/browse/YARN-3908 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Reporter: Zhijie Shen >Assignee: Vrushali C > Attachments: YARN-3908-YARN-2928.001.patch > > > 1. In HBaseTimelineWriterImpl, the info column family contains the basic > fields of a timeline entity plus events. However, entity#info map is not > stored at all. > 2 event#timestamp is also not persisted. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3908) Bugs in HBaseTimelineWriterImpl
[ https://issues.apache.org/jira/browse/YARN-3908?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14621982#comment-14621982 ] Hadoop QA commented on YARN-3908: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:red}-1{color} | pre-patch | 15m 38s | Findbugs (version ) appears to be broken on YARN-2928. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 1 new or modified test files. | | {color:green}+1{color} | javac | 7m 48s | There were no new javac warning messages. | | {color:green}+1{color} | javadoc | 9m 42s | There were no new javadoc warning messages. | | {color:green}+1{color} | release audit | 0m 23s | The applied patch does not increase the total number of release audit warnings. | | {color:green}+1{color} | checkstyle | 0m 18s | There were no new checkstyle issues. | | {color:green}+1{color} | whitespace | 0m 0s | The patch has no lines that end in whitespace. | | {color:green}+1{color} | install | 1m 43s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 40s | The patch built with eclipse:eclipse. | | {color:green}+1{color} | findbugs | 0m 47s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. | | {color:green}+1{color} | yarn tests | 1m 22s | Tests passed in hadoop-yarn-server-timelineservice. | | | | 38m 27s | | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12744663/YARN-3908-YARN-2928.001.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | YARN-2928 / 2d4a8f4 | | hadoop-yarn-server-timelineservice test log | https://builds.apache.org/job/PreCommit-YARN-Build/8494/artifact/patchprocess/testrun_hadoop-yarn-server-timelineservice.txt | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/8494/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf905.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/8494/console | This message was automatically generated. > Bugs in HBaseTimelineWriterImpl > --- > > Key: YARN-3908 > URL: https://issues.apache.org/jira/browse/YARN-3908 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Reporter: Zhijie Shen >Assignee: Vrushali C > Attachments: YARN-3908-YARN-2928.001.patch > > > 1. In HBaseTimelineWriterImpl, the info column family contains the basic > fields of a timeline entity plus events. However, entity#info map is not > stored at all. > 2 event#timestamp is also not persisted. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3908) Bugs in HBaseTimelineWriterImpl
[ https://issues.apache.org/jira/browse/YARN-3908?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14621592#comment-14621592 ] Zhijie Shen commented on YARN-3908: --- 1. TimelineEvent has a timestamp associated with it. It tells us when the event happened. We should have this information persisted, but unfortunately it seems not. 2. Metric doesn't have a timestamp because the timestamp is associated with each individual value. 3. I also realized that the metric type is not persisted too. Now I just assume if size(metric) > 1 => time series, else => single value in reader implementation. But it may not be guaranteed. > Bugs in HBaseTimelineWriterImpl > --- > > Key: YARN-3908 > URL: https://issues.apache.org/jira/browse/YARN-3908 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Reporter: Zhijie Shen >Assignee: Vrushali C > > 1. In HBaseTimelineWriterImpl, the info column family contains the basic > fields of a timeline entity plus events. However, entity#info map is not > stored at all. > 2 event#timestamp is also not persisted. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3908) Bugs in HBaseTimelineWriterImpl
[ https://issues.apache.org/jira/browse/YARN-3908?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14621571#comment-14621571 ] Vrushali C commented on YARN-3908: -- Hi [~zjshen] I see that event#info is not being stored, but which is the event timestamp that is being referred? Event metrics does store the timestamp per metric. (Also, I will be on vacation starting tomorrow through next week, so checking with Sangjin offline about this.). thanks Vrushali > Bugs in HBaseTimelineWriterImpl > --- > > Key: YARN-3908 > URL: https://issues.apache.org/jira/browse/YARN-3908 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Reporter: Zhijie Shen >Assignee: Vrushali C > > 1. In HBaseTimelineWriterImpl, the info column family contains the basic > fields of a timeline entity plus events. However, entity#info map is not > stored at all. > 2 event#timestamp is also not persisted. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3908) Bugs in HBaseTimelineWriterImpl
[ https://issues.apache.org/jira/browse/YARN-3908?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14621480#comment-14621480 ] Zhijie Shen commented on YARN-3908: --- Assign it to [~vrushalic] by default. Please feel free to rebalance the workload. > Bugs in HBaseTimelineWriterImpl > --- > > Key: YARN-3908 > URL: https://issues.apache.org/jira/browse/YARN-3908 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Reporter: Zhijie Shen >Assignee: Vrushali C > > 1. In HBaseTimelineWriterImpl, the info column family contains the basic > fields of a timeline entity plus events. However, entity#info map is not > stored at all. > 2 event#timestamp is also not persisted. -- This message was sent by Atlassian JIRA (v6.3.4#6332)