[jira] [Commented] (YARN-3051) [Storage abstraction] Create backing storage read interface for ATS readers
[ https://issues.apache.org/jira/browse/YARN-3051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14547299#comment-14547299 ] Hadoop QA commented on YARN-3051: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:red}-1{color} | patch | 0m 0s | The patch command could not apply the patch during dryrun. | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12732621/YARN-3051.wip.02.YARN-2928.patch | | Optional Tests | shellcheck javadoc javac unit findbugs checkstyle | | git revision | trunk / cab0dad | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/7963/console | This message was automatically generated. > [Storage abstraction] Create backing storage read interface for ATS readers > --- > > Key: YARN-3051 > URL: https://issues.apache.org/jira/browse/YARN-3051 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Affects Versions: YARN-2928 >Reporter: Sangjin Lee >Assignee: Varun Saxena > Attachments: YARN-3051.wip.02.YARN-2928.patch, YARN-3051.wip.patch, > YARN-3051_temp.patch > > > Per design in YARN-2928, create backing storage read interface that can be > implemented by multiple backing storage implementations. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3051) [Storage abstraction] Create backing storage read interface for ATS readers
[ https://issues.apache.org/jira/browse/YARN-3051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14547304#comment-14547304 ] Li Lu commented on YARN-3051: - Hi [~varun_saxena], I think the new patch name pattern should be, YARN-3051-YARN-2928.***.patch. Would you please try that again? Thanks!
[jira] [Commented] (YARN-3051) [Storage abstraction] Create backing storage read interface for ATS readers
[ https://issues.apache.org/jira/browse/YARN-3051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14548358#comment-14548358 ] Varun Saxena commented on YARN-3051: Thanks Li for pointing this out. I will anyway be updating the flow and user based APIs and adding a few tests, so I will take care of naming the patch this way in the next patch.
[jira] [Commented] (YARN-3051) [Storage abstraction] Create backing storage read interface for ATS readers
[ https://issues.apache.org/jira/browse/YARN-3051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14551265#comment-14551265 ] Li Lu commented on YARN-3051: - Hi [~varun_saxena], I just tried to apply the patch against the latest YARN-2928 branch, and there was a problem with pom.xml. When generating the next patch, could you please double check on that? I think it will be great if we can make some progress on the reader side now, so that we can have a working end-to-end v2 preview soon. Thanks!
[jira] [Commented] (YARN-3051) [Storage abstraction] Create backing storage read interface for ATS readers
[ https://issues.apache.org/jira/browse/YARN-3051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14552534#comment-14552534 ] Varun Saxena commented on YARN-3051: Well, I am still stuck on trying to get the attribute set via HttpServer2#setAttribute in WebServices class. Will update patch once that is done.
[jira] [Commented] (YARN-3051) [Storage abstraction] Create backing storage read interface for ATS readers
[ https://issues.apache.org/jira/browse/YARN-3051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14553364#comment-14553364 ]

Hadoop QA commented on YARN-3051:
-

\\ \\
| (x) *{color:red}-1 overall{color}* |
\\ \\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch | 14m 55s | Pre-patch YARN-2928 compilation is healthy. |
| {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. |
| {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 3 new or modified test files. |
| {color:green}+1{color} | javac | 7m 42s | There were no new javac warning messages. |
| {color:red}-1{color} | javadoc | 9m 39s | The applied patch generated 6 additional warning messages. |
| {color:red}-1{color} | release audit | 0m 19s | The applied patch generated 2 release audit warnings. |
| {color:red}-1{color} | checkstyle | 1m 19s | The applied patch generated 23 new checkstyle issues (total was 234, now 257). |
| {color:green}+1{color} | shellcheck | 0m 6s | There were no new shellcheck (v0.3.3) issues. |
| {color:red}-1{color} | whitespace | 0m 1s | The patch has 1 line(s) that end in whitespace. Use git apply --whitespace=fix. |
| {color:green}+1{color} | install | 1m 40s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse | 0m 41s | The patch built with eclipse:eclipse. |
| {color:red}-1{color} | findbugs | 3m 36s | The patch appears to introduce 6 new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | yarn tests | 0m 25s | Tests passed in hadoop-yarn-api. |
| {color:green}+1{color} | yarn tests | 1m 56s | Tests passed in hadoop-yarn-common. |
| {color:green}+1{color} | yarn tests | 1m 3s | Tests passed in hadoop-yarn-server-timelineservice. |
| | | 43m 47s | |
\\ \\
|| Reason || Tests ||
| FindBugs | module:hadoop-yarn-server-timelineservice |
| | Found reliance on default encoding in org.apache.hadoop.yarn.server.timelineservice.storage.FileSystemTimelineReaderImpl.getEntities(String, String, String, Long, Long, Long, String, Long, Collection, Collection, Collection, Collection, Collection, EnumSet): new java.io.FileReader(File) At FileSystemTimelineReaderImpl.java:[line 88] |
| | Found reliance on default encoding in org.apache.hadoop.yarn.server.timelineservice.storage.FileSystemTimelineReaderImpl.getEntity(String, String, String, String, Collection, Collection, Long, Long, EnumSet): new java.io.FileReader(File) At FileSystemTimelineReaderImpl.java:[line 68] |
| FindBugs | module:hadoop-yarn-common |
| | Inconsistent synchronization of org.apache.hadoop.yarn.api.protocolrecords.impl.pb.AllocateResponsePBImpl.builder; locked 92% of time. Unsynchronized access at AllocateResponsePBImpl.java:[line 391] |
| | Inconsistent synchronization of org.apache.hadoop.yarn.api.protocolrecords.impl.pb.AllocateResponsePBImpl.proto; locked 94% of time. Unsynchronized access at AllocateResponsePBImpl.java:[line 391] |
| | Inconsistent synchronization of org.apache.hadoop.yarn.api.protocolrecords.impl.pb.AllocateResponsePBImpl.viaProto; locked 94% of time. Unsynchronized access at AllocateResponsePBImpl.java:[line 391] |
| FindBugs | module:hadoop-yarn-api |
| | org.apache.hadoop.yarn.api.records.timelineservice.TimelineMetric$1.compare(Long, Long) negates the return value of Long.compareTo(Long) At TimelineMetric.java:[line 47] |
\\ \\
|| Subsystem || Report/Notes ||
| Patch URL | http://issues.apache.org/jira/secure/attachment/12734255/YARN-3051-YARN-2928.03.patch |
| Optional Tests | shellcheck javadoc javac unit findbugs checkstyle |
| git revision | YARN-2928 / 463e070 |
| javadoc | https://builds.apache.org/job/PreCommit-YARN-Build/8034/artifact/patchprocess/diffJavadocWarnings.txt |
| Release Audit | https://builds.apache.org/job/PreCommit-YARN-Build/8034/artifact/patchprocess/patchReleaseAuditProblems.txt |
| checkstyle | https://builds.apache.org/job/PreCommit-YARN-Build/8034/artifact/patchprocess/diffcheckstylehadoop-yarn-api.txt |
| whitespace | https://builds.apache.org/job/PreCommit-YARN-Build/8034/artifact/patchprocess/whitespace.txt |
| Findbugs warnings | https://builds.apache.org/job/PreCommit
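For illustration, two of the FindBugs categories in the report above (reliance on default encoding, and negating the result of compareTo) have common remedies that can be sketched as follows. This is a minimal sketch and not the code from the patch: the class name FindbugsFixSketch and both helper methods are hypothetical, and the sorted map of timestamps to values is only an assumption about why a reversed Long comparator would be wanted. Negating compareTo is flagged because the contract allows Integer.MIN_VALUE as a result, and negating that value leaves it unchanged.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Comparator;
import java.util.TreeMap;

// Hypothetical class; it only illustrates the two warning categories.
final class FindbugsFixSketch {

  // Instead of new FileReader(file), which silently uses the platform
  // default charset, wrap the stream and name the charset explicitly.
  static String readFirstLine(InputStream in) {
    try (BufferedReader reader = new BufferedReader(
        new InputStreamReader(in, StandardCharsets.UTF_8))) {
      return reader.readLine();
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  // Instead of a comparator that returns -a.compareTo(b), ask the JDK for
  // the reversed natural order, so larger (newer) timestamps sort first.
  static TreeMap<Long, Number> newestFirst() {
    return new TreeMap<>(Comparator.<Long>reverseOrder());
  }

  public static void main(String[] args) {
    byte[] bytes = "line one\nline two".getBytes(StandardCharsets.UTF_8);
    System.out.println(readFirstLine(new ByteArrayInputStream(bytes)));

    TreeMap<Long, Number> values = newestFirst();
    values.put(100L, 1);
    values.put(300L, 3);
    values.put(200L, 2);
    System.out.println(values.firstKey());
  }
}
```

For a file on disk, the same idea applies by opening the file as a stream (or via java.nio.file.Files.newBufferedReader with an explicit charset) rather than through FileReader.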
[jira] [Commented] (YARN-3051) [Storage abstraction] Create backing storage read interface for ATS readers
[ https://issues.apache.org/jira/browse/YARN-3051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14561393#comment-14561393 ] Hadoop QA commented on YARN-3051: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:red}-1{color} | patch | 0m 0s | The patch command could not apply the patch during dryrun. | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12735644/YARN-3051-YARN-2928.003.patch | | Optional Tests | shellcheck javadoc javac unit findbugs checkstyle | | git revision | YARN-2928 / e19566a | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/8100/console | This message was automatically generated. > [Storage abstraction] Create backing storage read interface for ATS readers > --- > > Key: YARN-3051 > URL: https://issues.apache.org/jira/browse/YARN-3051 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Affects Versions: YARN-2928 >Reporter: Sangjin Lee >Assignee: Varun Saxena > Attachments: YARN-3051-YARN-2928.003.patch, > YARN-3051-YARN-2928.03.patch, YARN-3051.wip.02.YARN-2928.patch, > YARN-3051.wip.patch, YARN-3051_temp.patch > > > Per design in YARN-2928, create backing storage read interface that can be > implemented by multiple backing storage implementations. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3051) [Storage abstraction] Create backing storage read interface for ATS readers
[ https://issues.apache.org/jira/browse/YARN-3051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14561641#comment-14561641 ] Varun Saxena commented on YARN-3051: In the API designed in the patch, there are a few things I wanted to discuss.
# We can either return a single timeline entity for a flow ID (having aggregated metric values) or multiple entities indicating multiple flow runs for a flow ID. I have included an API for the former as of now. I think there can be use cases for both, though. [~vrushalic], did hRaven have the facility for both kinds of queries? I mean, is there a known use case?
# Do we plan to include additional info in the user table which can be used for filtering user level entities? I could not think of any use case, but just for flexibility I have added filters in the API {{getUserEntities}}.
# I have included an API to query flow information based on the appid. As of now I return the flow to which the app belongs (which includes multiple runs) instead of the flow run it belongs to. Which is the more viable scenario? Or do we need to support both?
# In the HBase schema design, there are 2 flow summary tables, aggregated daily and weekly respectively. So to limit the number of metric records or to see metrics in a specific time window, I have added metric start and metric end timestamps in the API design. But if metrics are aggregated daily and weekly, we won't be able to get something like the value of a specific metric for a flow from, say, Thursday 4 pm to Friday 9 am. [~vrushalic], can you confirm? If this is so, a timestamp doesn't make much sense. Dates can be specified instead.
# Will there be queue table(s) in addition to user table(s)? If yes, how will queue data be aggregated? Based on entity type? I may need an additional API for queues then.
# The doubt I have regarding flow version will anyway be addressed by YARN-3699.
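To make the concern about daily aggregation concrete, the following minimal sketch computes which complete calendar days fall inside a query window. It is not taken from the patch: the class DailyWindowSketch and the method fullDaysWithin are hypothetical, and it assumes a daily summary row can only answer for a whole day. A window from Thursday 4 pm to Friday 9 am contains no complete day, so daily rows alone cannot serve it.

```java
import java.time.LocalDate;
import java.time.LocalDateTime;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper; assumes daily aggregates cover whole calendar days.
final class DailyWindowSketch {

  // Returns the calendar days that lie entirely inside [start, end).
  static List<LocalDate> fullDaysWithin(LocalDateTime start, LocalDateTime end) {
    List<LocalDate> days = new ArrayList<>();
    // First candidate: the start's own day if start is exactly midnight,
    // otherwise the following day.
    LocalDate day = start.toLocalDate();
    if (!start.equals(day.atStartOfDay())) {
      day = day.plusDays(1);
    }
    // A day is fully covered when the next midnight is not after the end.
    while (!day.plusDays(1).atStartOfDay().isAfter(end)) {
      days.add(day);
      day = day.plusDays(1);
    }
    return days;
  }

  public static void main(String[] args) {
    // 2015-05-28 was a Thursday.
    LocalDateTime thursday4pm = LocalDateTime.of(2015, 5, 28, 16, 0);
    LocalDateTime friday9am = LocalDateTime.of(2015, 5, 29, 9, 0);
    LocalDateTime sunday9am = LocalDateTime.of(2015, 5, 31, 9, 0);
    System.out.println(fullDaysWithin(thursday4pm, friday9am));
    System.out.println(fullDaysWithin(thursday4pm, sunday9am));
  }
}
```

Any part of the window outside the returned days would have to come from finer-grained data, or the API would have to restrict callers to date boundaries.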
[jira] [Commented] (YARN-3051) [Storage abstraction] Create backing storage read interface for ATS readers
[ https://issues.apache.org/jira/browse/YARN-3051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14561770#comment-14561770 ] Vrushali C commented on YARN-3051: -- Hi Varun, Good points. My answers inline.

bq. We can either return a single timeline entity for a flow ID (having aggregated metric values) or multiple entities indicating multiple flow runs for a flow ID. I have included an API for the former as of now. I think there can be use cases for both, though. Vrushali C, did hRaven have the facility for both kinds of queries? I mean, is there a known use case?

Yes, there are use cases for both. hRaven has APIs for both types of calls; they are named differently, though. The /flow endpoint in hRaven will return multiple flow runs (limited by filters). The /summary endpoint will return aggregated values for all the runs of that flow in that time range filter. Let me give an example (a hadoop sleep job for simplicity). Say user janedoe runs a hadoop sleep job 3 times today, has run it 5 times yesterday, and say 6 times on one day about a month back. Now, we may want to see two different things:

#1 Summarized stats for flow “Sleep job” invoked in the last 2 days: It would say this flow was run 8 times, the first run was at timestamp X, the last run was at timestamp Y, it took up a total of N megabytemillis, had a total of M containers across all runs, etc. It tells us how much of the cluster capacity a particular flow from a particular user is taking up.

#2 List of flow runs: Will show us details about each flow run. If we say limit = 3 in the query parameters, it would return the latest 3 runs of this flow. If we say limit = 100, it would return all the runs in this particular case (including the ones from a month back). If we pass in flowVersion=XXYYZZ, then it would return the list of flows that match this version.

For the initial development, I think we may want to work on #2 first (return list of flow runs). The summary API will need aggregated tables, which we can add later on; we could file a jira for that, my 2c.

bq. Do we plan to include additional info in the user table which can be used for filtering user level entities? Could not think of any use case but just for flexibility I have added filters in the API getUserEntities.

I haven’t looked at the code in detail, but as such, for user level entities, we would want a time range, a limit on the number of records returned, a flow name filter and a cluster name filter.

bq. I have included an API to query flow information based on the appid. As of now I return the flow to which the app belongs (which includes multiple runs) instead of the flow run it belongs to. Which is the more viable scenario? Or do we need to support both?

An app id can belong to exactly one flow run. App id is the hadoop yarn application id, which should be unique on the cluster. Given an app id, we should be able to look up the exact flow run and return just that. The equivalent API in hRaven is /jobFlow.

bq. But if metrics are aggregated daily and weekly, we won't be able to get something like the value of a specific metric for a flow from, say, Thursday 4 pm to Friday 9 am. Vrushali C, can you confirm? If this is so, a timestamp doesn't make much sense. Dates can be specified instead.

The thinking is to split the querying across tables. We would query both the daily summary table for the complete day details and the regular flow tables for details like those of Thursday 4 pm to Friday 9 am. But this does mean aggregating on the query side. So, I think, for starters, we could start off by allowing Date boundaries. We can enhance the API to accept finer timestamps later.

bq. Will there be queue table(s) in addition to user table(s)? If yes, how will queue data be aggregated? Based on entity type? I may need an additional API for queues then.

Yes, we would need a queue based aggregation table. Right now, those details are to be worked out. So perhaps we can leave aside the queue based APIs (or file a different jira to handle queue based APIs).

Hope this helps. I can give you more examples if you would like to get more details or have any other questions. I will also look at the patch this week. Also, we should ensure we use the same classes/methods for key related (flow keys, row keys) construction and parsing across reader APIs and writer APIs, else they will diverge.

thanks
Vrushali
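The janedoe example above can be sketched in a few lines to show how the two query styles differ. This is a minimal in-memory illustration under stated assumptions, not hRaven or timeline service code: the FlowRun and Summary classes, the method names, and the day-based timestamps (day number times 100, with day 30 as today) are all hypothetical.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical in-memory model of flow runs for a single flow and user.
final class FlowRunQuerySketch {

  static final class FlowRun {
    final long startTime;
    final long megabyteMillis;

    FlowRun(long startTime, long megabyteMillis) {
      this.startTime = startTime;
      this.megabyteMillis = megabyteMillis;
    }
  }

  static final class Summary {
    final int runCount;
    final long firstStart;
    final long lastStart;
    final long totalMegabyteMillis;

    Summary(int runCount, long firstStart, long lastStart, long total) {
      this.runCount = runCount;
      this.firstStart = firstStart;
      this.lastStart = lastStart;
      this.totalMegabyteMillis = total;
    }
  }

  // Query style #1: one aggregated record for runs in [windowStart, windowEnd).
  // With no matching runs, firstStart/lastStart keep their sentinel values.
  static Summary summarize(List<FlowRun> runs, long windowStart, long windowEnd) {
    int count = 0;
    long first = Long.MAX_VALUE;
    long last = Long.MIN_VALUE;
    long total = 0;
    for (FlowRun run : runs) {
      if (run.startTime >= windowStart && run.startTime < windowEnd) {
        count++;
        first = Math.min(first, run.startTime);
        last = Math.max(last, run.startTime);
        total += run.megabyteMillis;
      }
    }
    return new Summary(count, first, last, total);
  }

  // Query style #2: the individual runs, newest first, capped by limit.
  static List<FlowRun> latestRuns(List<FlowRun> runs, int limit) {
    List<FlowRun> sorted = new ArrayList<>(runs);
    sorted.sort(Comparator.comparingLong((FlowRun r) -> r.startTime).reversed());
    return new ArrayList<>(sorted.subList(0, Math.min(limit, sorted.size())));
  }

  // 6 runs about a month back (day 0), 5 yesterday (day 29), 3 today (day 30).
  static List<FlowRun> sampleRuns() {
    List<FlowRun> runs = new ArrayList<>();
    addRuns(runs, 0, 6);
    addRuns(runs, 29, 5);
    addRuns(runs, 30, 3);
    return runs;
  }

  private static void addRuns(List<FlowRun> runs, int day, int count) {
    for (int i = 0; i < count; i++) {
      runs.add(new FlowRun(day * 100L + i, 10L));
    }
  }

  public static void main(String[] args) {
    List<FlowRun> runs = sampleRuns();
    System.out.println(summarize(runs, 2900L, 3100L).runCount); // last 2 days
    System.out.println(latestRuns(runs, 3).size());
    System.out.println(latestRuns(runs, 100).size());
  }
}
```

The summary query collapses the last two days into one record, while the list query returns individual runs and lets the limit decide how far back the result reaches.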
[jira] [Commented] (YARN-3051) [Storage abstraction] Create backing storage read interface for ATS readers
[ https://issues.apache.org/jira/browse/YARN-3051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14562380#comment-14562380 ] Varun Saxena commented on YARN-3051: Thanks for the replies.
[jira] [Commented] (YARN-3051) [Storage abstraction] Create backing storage read interface for ATS readers
[ https://issues.apache.org/jira/browse/YARN-3051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14568128#comment-14568128 ] Li Lu commented on YARN-3051: - Hi [~varun_saxena], thanks for the work! Not sure if you've already made progress since the latest patch, but I'm posting some of my comments and questions w.r.t. the reader API design in the 003 patch. I may have more comments in the near future, but I wouldn't mind seeing a new patch before posting them.
# I noticed there is a _readerLimit_ for read operations, which works for ATS v1. I'm wondering if it's fine to use -1 to indicate there's no such limit? Not sure if this feature is already there.
# On the {{fromId}} parameter, we may need to be careful with the concept of "id". In timeline v2 we need context information to identify each entity, such as cluster, user, flow, run. When querying with {{fromId}}, what kind of assumptions should we make about the "id" here? Are we assuming all entities are of the same cluster, user, and/or flow, or is the "id" a concatenation of all this information, or is it something else?
# For all filter-related parameters, I'm not sure if the current object model and storage implementation support a trivial solution. I'd certainly welcome any comments/suggestions on this problem.
# Based on the previous two issues, a more general question is: shall we focus on an evolution of the v1 API here, or start a v2 reader API design from scratch and then try to make it compatible with the v1 APIs? The current patch looks to be pursuing the evolution approach.
# In some APIs, we're requiring clusterID and appID, but not flow/run information. In the current writer implementations, this indicates a full table scan. Maybe we can have flow and run information as optional parameters so that we can avoid full table scans when the caller does have flow and run information?
# The current APIs require a pretty long list of parameters. For most of the use cases, I think we can abstract something much simpler. Do we plan to add those "simple APIs" in a higher layer? I think having a lot of nulls when calling the reader API looks suboptimal, but with only these few APIs we may need to do this frequently?
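One possible shape for the points about -1 as "no limit", optional flow/run information, and long null-filled parameter lists is a small context object with a builder. The sketch below is purely illustrative and not part of any patch: ReaderContextSketch, its fields, and the NO_LIMIT constant are hypothetical names, and the idea that a missing flow/run forces a full scan simply mirrors the statement above about the current writer implementations.

```java
// Hypothetical parameter object; none of these names come from the patch.
final class ReaderContextSketch {

  // Sentinel meaning "do not cap the number of returned records".
  static final long NO_LIMIT = -1L;

  final String clusterId;
  final String appId;
  final String flowId;   // null when the caller does not know the flow
  final Long flowRunId;  // null when the caller does not know the run
  final long limit;

  private ReaderContextSketch(Builder b) {
    this.clusterId = b.clusterId;
    this.appId = b.appId;
    this.flowId = b.flowId;
    this.flowRunId = b.flowRunId;
    this.limit = b.limit;
  }

  // True when a storage implementation could build a full row-key prefix
  // instead of scanning (assumption based on the discussion above).
  boolean hasFlowContext() {
    return flowId != null && flowRunId != null;
  }

  boolean isUnlimited() {
    return limit == NO_LIMIT;
  }

  static Builder forApp(String clusterId, String appId) {
    return new Builder(clusterId, appId);
  }

  static final class Builder {
    private final String clusterId;
    private final String appId;
    private String flowId;
    private Long flowRunId;
    private long limit = NO_LIMIT;

    private Builder(String clusterId, String appId) {
      this.clusterId = clusterId;
      this.appId = appId;
    }

    Builder flow(String flowId, long flowRunId) {
      this.flowId = flowId;
      this.flowRunId = flowRunId;
      return this;
    }

    Builder limit(long limit) {
      if (limit != NO_LIMIT && limit <= 0) {
        throw new IllegalArgumentException("limit must be positive or NO_LIMIT");
      }
      this.limit = limit;
      return this;
    }

    ReaderContextSketch build() {
      return new ReaderContextSketch(this);
    }
  }

  public static void main(String[] args) {
    ReaderContextSketch minimal = forApp("cluster1", "application_1_0001").build();
    ReaderContextSketch full = forApp("cluster1", "application_1_0001")
        .flow("sleep_job", 1L).limit(100).build();
    System.out.println(minimal.hasFlowContext() + " " + minimal.isUnlimited());
    System.out.println(full.hasFlowContext() + " " + full.isUnlimited());
  }
}
```

Callers that only know the cluster and app ID set nothing else, which avoids passing a row of nulls, while callers that know the flow and run can supply them.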
[jira] [Commented] (YARN-3051) [Storage abstraction] Create backing storage read interface for ATS readers
[ https://issues.apache.org/jira/browse/YARN-3051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14570941#comment-14570941 ] Varun Saxena commented on YARN-3051:

bq. I noticed there is a readerLimit for read operations, which works for ATS v1. I'm wondering if it's fine to use -1 to indicate there's no such limit? Not sure if this feature is already there.

You mean the limit used to cap the number of records?

bq. The fromId parameter, we may need to be careful on the concept of "id". In timeline v2 we need context information to identify each entity, such as cluster, user, flow, run. When querying with fromId, what kind of assumptions should we make on the "id" here?

{{fromId}} is primarily there to be backward compatible with ATS v1. It is used in the context of entity ID only. This will be documented in the javadoc. I have not changed the names of the query params (if these parameters are supported in ATS v1). Whether we need to support the same REST endpoints as ATS v1 for the sake of backward compatibility, or whether we can break backward compatibility (if there is no use case), is something I wanted to discuss. I commented on YARN-3411 as well regarding one such param.

bq. In some APIs, we're requiring clusterID and appID, but not having flow/run information. Maybe we can have flow and run information as optional parameters so that we can avoid full table scans when the caller does have flow and run information?

Agree with your suggestion. I was also thinking about including them in the next patch as query params. This will make the parameter list even longer :)

bq. The current APIs require a pretty long list of parameters. For most of the use cases, I think we can abstract something much simpler.

These parameters are fetched directly from the query params coming in through the REST API and are passed straight down to the storage layer (after minor verification). Yes, we can decide on a few of the key parameters (which correspond to the row key/primary key) and have different methods for those, and have different reader API methods for them as well.
[jira] [Commented] (YARN-3051) [Storage abstraction] Create backing storage read interface for ATS readers
[ https://issues.apache.org/jira/browse/YARN-3051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14573830#comment-14573830 ] Zhijie Shen commented on YARN-3051: --- [~varun_saxena], thanks for working on the new patch. It seems to be a complete reader-side prototype, which is nice. I still need some time to take a thorough look, but I'd like to share my thoughts about the reader APIs. IMHO, we may want to have, or start with, two sets of APIs: 1) the APIs to query the raw data and 2) the APIs to query the aggregation data.

1) APIs to query the raw data
We would like to have APIs that let users zoom into the details of their jobs, and that give users the freedom to fetch the raw data and do the customized processing that ATS will not do. For example, Hive/Pig on Tez need this set of APIs to get the framework-specific data, process it and render it on their own web UI. We basically need 2 such APIs.
a. Get a single entity given an ID that uniquely locates the entity in the backend (we assume the uniqueness is assured somehow).
* This API can be extended or split into multiple sub-APIs to get a single element of the entity, such as events, metrics and configuration.
b. Search for a set of entities that match the given predicates.
* We can start from the predicates that we used in ATS v1 (also for compatibility purposes), but some of them may no longer apply.
* We may want to add more predicates to check the newly added elements in v2.
* With more predefined semantics, we can even query entities that belong to some container/attempt/application and so on.

2) APIs to query the aggregation data
These are completely new in v2 and are its advantage. With the aggregation, we can answer some statistical questions about the job, the user, the queue, the flow and the cluster. These APIs do not direct users to the individual entities put by the application, but return statistical data (carried by Application|User|Queue|Flow|ClusterEntity).
a. Get aggregation data at a certain level given the ID of the concept on that level, i.e., the job, the user, the queue, the flow and the cluster.
b. Search for the jobs, the users, the queues, the flows and the clusters given predicates.
* For the predicates, we could learn from the examples in hRaven.
[jira] [Commented] (YARN-3051) [Storage abstraction] Create backing storage read interface for ATS readers
[ https://issues.apache.org/jira/browse/YARN-3051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14580858#comment-14580858 ] Varun Saxena commented on YARN-3051: [~zjshen], thanks for your inputs. I will brief you about the APIs I have decided on as of now.
# APIs for querying an individual entity/flow/flow run/user and APIs for querying a set of entities/flow runs/flows/users. APIs such as those for a set of flows/users will contain aggregated data. The reason for separate endpoints for entities, flows, users, etc. is the different tables in the HBase/Phoenix schema.
# Most of the APIs will be variations of either getting a single entity or a set of entities. So I will primarily talk about an entity and a set of entities in subsequent points.
# For getting a set of entities, there will be 3 kinds of filters: filtering on the basis of info, filtering on configs and filtering on metrics. Filtering on the basis of info and configs will be based on equality; for instance, fetch entities which have a config name matching a specific config value. Metrics filtering, though, will be on the basis of relational operators. For instance, a user can query entities which have a specific metric >= a certain value.
# In addition to that, certain predicates such as limit, windowStart, windowEnd, etc., which used to exist in ATSv1, exist even now. Some predicates such as fromId and fromTs may not make sense in ATSv2, but I have included them for now with the intention of discussing them.
# Additional predicates such as metricswindowStart and end have been specified to fetch metrics data for a specific time span. I included this because it can aid in plotting graphs on the UI for a specific metric of some entity.
# Only entity id, type, created and modified time will be returned if fields are not specified in the REST URL. This will be the default view of an entity.
# Moreover, you can also specify which configurations and metrics to return.
# Every query param will be received as a String, even timestamps. Now, from the backing storage implementation viewpoint, would it make more sense to let these query params be passed as strings, or to do datatype conversion?

A few concerns from Li Lu regarding the parameter list becoming too long are quite valid, as most of them will be nulls. We can also club multiple related parameters into different classes to reduce them. Or, as he said, have different methods for frequently occurring use cases. Thoughts? Comments are welcome so that this JIRA can speed up, probably after Hadoop Summit :)
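The two filter styles described above (equality on configs or info, relational comparison on metrics) can be sketched as simple predicates over maps. This is only an illustration under assumptions: the class EntityFilterSketch, the Op enum, and the representation of configs and metrics as plain maps are hypothetical and do not reflect the object model in the patch.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical predicates over plain maps standing in for entity fields.
final class EntityFilterSketch {

  enum Op { LT, LE, EQ, GE, GT }

  static boolean compare(long actual, Op op, long expected) {
    switch (op) {
      case LT: return actual < expected;
      case LE: return actual <= expected;
      case EQ: return actual == expected;
      case GE: return actual >= expected;
      case GT: return actual > expected;
      default: throw new IllegalArgumentException("unknown operator " + op);
    }
  }

  // Equality filter: every required key must be present with the same value.
  static boolean matchesAll(Map<String, String> actual, Map<String, String> required) {
    for (Map.Entry<String, String> e : required.entrySet()) {
      if (!e.getValue().equals(actual.get(e.getKey()))) {
        return false;
      }
    }
    return true;
  }

  // Relational filter: the named metric must exist and satisfy the operator.
  static boolean matchesMetric(Map<String, Long> metrics, String name, Op op, long threshold) {
    Long actual = metrics.get(name);
    return actual != null && compare(actual, op, threshold);
  }

  public static void main(String[] args) {
    Map<String, String> configs = new HashMap<>();
    configs.put("mapreduce.job.queuename", "default");
    Map<String, Long> metrics = new HashMap<>();
    metrics.put("MAP_SLOT_MILLIS", 5000L);

    Map<String, String> wanted = new HashMap<>();
    wanted.put("mapreduce.job.queuename", "default");

    System.out.println(matchesAll(configs, wanted));
    System.out.println(matchesMetric(metrics, "MAP_SLOT_MILLIS", Op.GE, 1000L));
  }
}
```

A storage implementation could translate the same operators into its own scan filters rather than evaluating them in memory; the sketch only fixes the semantics being discussed.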
[jira] [Commented] (YARN-3051) [Storage abstraction] Create backing storage read interface for ATS readers
[ https://issues.apache.org/jira/browse/YARN-3051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14580910#comment-14580910 ]
Varun Saxena commented on YARN-3051:
As of now, there are very similar APIs for getEntity/getFlowEntity/getUserEntity etc. Would it be fine to combine these APIs and pass something like a query type (ENTITY/USER/FLOW, etc.) in the API, which the storage implementation can then use to decide which type of query it is?
> [Storage abstraction] Create backing storage read interface for ATS readers > --- > > Key: YARN-3051 > URL: https://issues.apache.org/jira/browse/YARN-3051 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Affects Versions: YARN-2928 >Reporter: Sangjin Lee >Assignee: Varun Saxena > Attachments: YARN-3051-YARN-2928.003.patch, > YARN-3051-YARN-2928.03.patch, YARN-3051-YARN-2928.04.patch, > YARN-3051.wip.02.YARN-2928.patch, YARN-3051.wip.patch, YARN-3051_temp.patch > > > Per design in YARN-2928, create backing storage read interface that can be > implemented by multiple backing storage implementations. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3051) [Storage abstraction] Create backing storage read interface for ATS readers
[ https://issues.apache.org/jira/browse/YARN-3051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14580961#comment-14580961 ] Hadoop QA commented on YARN-3051: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:red}-1{color} | pre-patch | 17m 26s | Findbugs (version ) appears to be broken on YARN-2928. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 3 new or modified test files. | | {color:green}+1{color} | javac | 7m 56s | There were no new javac warning messages. | | {color:red}-1{color} | javadoc | 10m 12s | The applied patch generated 11 additional warning messages. | | {color:green}+1{color} | release audit | 0m 24s | The applied patch does not increase the total number of release audit warnings. | | {color:red}-1{color} | checkstyle | 1m 22s | The applied patch generated 25 new checkstyle issues (total was 243, now 267). | | {color:green}+1{color} | shellcheck | 0m 6s | There were no new shellcheck (v0.3.3) issues. | | {color:green}+1{color} | whitespace | 0m 1s | The patch has no lines that end in whitespace. | | {color:green}+1{color} | install | 1m 40s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 40s | The patch built with eclipse:eclipse. | | {color:red}-1{color} | findbugs | 4m 2s | The patch appears to introduce 5 new Findbugs (version 3.0.0) warnings. | | {color:green}+1{color} | yarn tests | 0m 22s | Tests passed in hadoop-yarn-api. | | {color:green}+1{color} | yarn tests | 1m 59s | Tests passed in hadoop-yarn-common. | | {color:green}+1{color} | yarn tests | 1m 27s | Tests passed in hadoop-yarn-server-timelineservice. 
| | | | 48m 2s | | \\ \\ || Reason || Tests || | FindBugs | module:hadoop-yarn-server-timelineservice | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12738884/YARN-3051-YARN-2928.04.patch | | Optional Tests | shellcheck javadoc javac unit findbugs checkstyle | | git revision | YARN-2928 / 0a3c147 | | javadoc | https://builds.apache.org/job/PreCommit-YARN-Build/8234/artifact/patchprocess/diffJavadocWarnings.txt | | checkstyle | https://builds.apache.org/job/PreCommit-YARN-Build/8234/artifact/patchprocess/diffcheckstylehadoop-yarn-api.txt | | Findbugs warnings | https://builds.apache.org/job/PreCommit-YARN-Build/8234/artifact/patchprocess/newPatchFindbugsWarningshadoop-yarn-server-timelineservice.html | | hadoop-yarn-api test log | https://builds.apache.org/job/PreCommit-YARN-Build/8234/artifact/patchprocess/testrun_hadoop-yarn-api.txt | | hadoop-yarn-common test log | https://builds.apache.org/job/PreCommit-YARN-Build/8234/artifact/patchprocess/testrun_hadoop-yarn-common.txt | | hadoop-yarn-server-timelineservice test log | https://builds.apache.org/job/PreCommit-YARN-Build/8234/artifact/patchprocess/testrun_hadoop-yarn-server-timelineservice.txt | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/8234/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf903.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/8234/console | This message was automatically generated. 
> [Storage abstraction] Create backing storage read interface for ATS readers > --- > > Key: YARN-3051 > URL: https://issues.apache.org/jira/browse/YARN-3051 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Affects Versions: YARN-2928 >Reporter: Sangjin Lee >Assignee: Varun Saxena > Attachments: YARN-3051-YARN-2928.003.patch, > YARN-3051-YARN-2928.03.patch, YARN-3051-YARN-2928.04.patch, > YARN-3051.wip.02.YARN-2928.patch, YARN-3051.wip.patch, YARN-3051_temp.patch > > > Per design in YARN-2928, create backing storage read interface that can be > implemented by multiple backing storage implementations. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3051) [Storage abstraction] Create backing storage read interface for ATS readers
[ https://issues.apache.org/jira/browse/YARN-3051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14583839#comment-14583839 ]
Li Lu commented on YARN-3051:
Hi [~varun_saxena], thanks for the update! Some of my quick thoughts for discussion...
# I just realized that in this JIRA we are creating the "backing storage read interface for ATS readers", not the user-facing ATS reader APIs. I believe these two topics are different: in this JIRA we're "wiring up" the storage systems, but in the ATS reader APIs we need to deal with user requirements. That said, I think the main design goal here is to provide a small set of generic interfaces so that we can easily connect them to our writers. We may want to have some brief ideas of the potential user-facing features (as [~zjshen] mentioned in a previous comment), but I'm not sure we need to implement them before we make a concrete design for the storage read interface.
# If my understanding in point 1 is right, then perhaps we do not need to worry much about the huge list of nulls. Of course, at the code level we may want to do some cosmetic fixes, but since those interfaces are not user facing, making them more general may be more important, I think?
# I still think that when doing the v2 interface design, it is fine, if not even beneficial, to start from scratch rather than thinking about the existing v1 design. If we're not implementing some v1 features as first-class in v2 storage implementations, maybe we can simply leave them out of the storage-level interfaces? (I assume we'll have an intermediate layer to do the wire up between our user-facing reader APIs and the storage interfaces.)
# bq. Now, from the backing storage implementation viewpoint, would it make more sense to let these query params be passed as strings or to do datatype conversion?
I've got no strong preference on this.
Leaving them as a generic type (like string) gives the storage layer more freedom to interpret the data, but the readers need to ensure they understand the types by themselves. BTW, could you please briefly skim through the list of Jenkins warnings and see if they're critical? Thanks! > [Storage abstraction] Create backing storage read interface for ATS readers > --- > > Key: YARN-3051 > URL: https://issues.apache.org/jira/browse/YARN-3051 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Affects Versions: YARN-2928 >Reporter: Sangjin Lee >Assignee: Varun Saxena > Attachments: YARN-3051-YARN-2928.003.patch, > YARN-3051-YARN-2928.03.patch, YARN-3051-YARN-2928.04.patch, > YARN-3051.wip.02.YARN-2928.patch, YARN-3051.wip.patch, YARN-3051_temp.patch > > > Per design in YARN-2928, create backing storage read interface that can be > implemented by multiple backing storage implementations. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3051) [Storage abstraction] Create backing storage read interface for ATS readers
[ https://issues.apache.org/jira/browse/YARN-3051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14583878#comment-14583878 ]
Li Lu commented on YARN-3051:
I verified locally that the pre-patch findbugs warnings no longer exist.
> [Storage abstraction] Create backing storage read interface for ATS readers > --- > > Key: YARN-3051 > URL: https://issues.apache.org/jira/browse/YARN-3051 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Affects Versions: YARN-2928 >Reporter: Sangjin Lee >Assignee: Varun Saxena > Attachments: YARN-3051-YARN-2928.003.patch, > YARN-3051-YARN-2928.03.patch, YARN-3051-YARN-2928.04.patch, > YARN-3051.wip.02.YARN-2928.patch, YARN-3051.wip.patch, YARN-3051_temp.patch > > > Per design in YARN-2928, create backing storage read interface that can be > implemented by multiple backing storage implementations. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3051) [Storage abstraction] Create backing storage read interface for ATS readers
[ https://issues.apache.org/jira/browse/YARN-3051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14584148#comment-14584148 ]
Zhijie Shen commented on YARN-3051:
bq. APIs for querying an individual entity/flow/flow run/user and APIs for querying a set of entities/flow runs/flows/users. APIs for such a set of flows/users will contain aggregated data. The reason for separate endpoints for entities, flows, users, etc. is the different tables in the HBase/Phoenix schema.
I think we don't store the first-class citizen entities in a different way or in different tables (Li/Vrushali, correct me if I'm wrong). When fetching an entity, it doesn't matter whether it is a customized entity or a predefined entity such as ApplicationEntity. In fact, we have two levels of interfaces. One is the storage interface and the other is the user-oriented interface. I think it's a good idea to let the user-oriented interface have more specific/advanced APIs to handle the special entity objects, while the storage interface could have fewer, more uniform APIs to reuse the common logic as much as possible. Thoughts?
bq. Every query param will be received as a String, even timestamps. Now, from the backing storage implementation viewpoint, would it make more sense to let these query params be passed as strings or to do datatype conversion?
I think we need to take the generic type as the param. If it's transformed to a string, it is likely to be difficult to recover the original type information. For example, when we see a string "true", how do we know whether it used to be a "true" string or a true boolean? Also, is "1234567" a number or a string that represents a vehicle license?
> [Storage abstraction] Create backing storage read interface for ATS readers > --- > > Key: YARN-3051 > URL: https://issues.apache.org/jira/browse/YARN-3051 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Affects Versions: YARN-2928 >Reporter: Sangjin Lee >Assignee: Varun Saxena > Attachments: YARN-3051-YARN-2928.003.patch, > YARN-3051-YARN-2928.03.patch, YARN-3051-YARN-2928.04.patch, > YARN-3051.wip.02.YARN-2928.patch, YARN-3051.wip.patch, YARN-3051_temp.patch > > > Per design in YARN-2928, create backing storage read interface that can be > implemented by multiple backing storage implementations. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
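The type-recovery concern raised above can be illustrated with a few lines of Java. This is a minimal sketch, assuming filter values are carried as `Object` rather than `String`; the class name `TypedFilterDemo` and the filter keys are made up for the example.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical demo class; the filter keys below are invented for illustration.
final class TypedFilterDemo {
  private TypedFilterDemo() { }

  // Describes a filter value together with its runtime type.
  static String describe(Object value) {
    return value.getClass().getSimpleName() + ":" + value;
  }

  public static void main(String[] args) {
    Map<String, Object> infoFilters = new LinkedHashMap<>();
    infoFilters.put("isFinal", Boolean.TRUE); // a real boolean
    infoFilters.put("label", "true");         // a string that merely reads "true"
    infoFilters.put("attempts", 1234567L);    // a number
    infoFilters.put("plate", "1234567");      // text, e.g. a vehicle license

    // Once flattened with toString(), the first two values (and the last two)
    // are indistinguishable; kept as Object, their types survive.
    for (Map.Entry<String, Object> e : infoFilters.entrySet()) {
      System.out.println(e.getKey() + " -> " + describe(e.getValue()));
    }
  }
}
```

If the params were converted to strings before reaching the storage layer, `Boolean.TRUE` and `"true"` would collapse to the same value, which is the ambiguity the comment describes.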
[jira] [Commented] (YARN-3051) [Storage abstraction] Create backing storage read interface for ATS readers
[ https://issues.apache.org/jira/browse/YARN-3051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14584197#comment-14584197 ]
Li Lu commented on YARN-3051:
bq. APIs for querying an individual entity/flow/flow run/user and APIs for querying a set of entities/flow runs/flows/users. APIs for such a set of flows/users will contain aggregated data.
bq. I think we don't store the first-class citizen entities in a different way or in different tables (Li/Vrushali, correct me if I'm wrong). When fetching an entity, it doesn't matter whether it is a customized entity or a predefined entity such as ApplicationEntity.
If we're discussing the storage read interface, why is it harmful to explicitly separate the interfaces for raw data and aggregated data, as [~zjshen] proposed before? We can work on the raw data interface first, while designing aggregations.
bq. If it's transformed to a string, it is likely to be difficult to recover the original type information.
I agree. A follow-up concern is who maintains, or explains, the type information. I assume we need the readers themselves to keep track of this?
> [Storage abstraction] Create backing storage read interface for ATS readers > --- > > Key: YARN-3051 > URL: https://issues.apache.org/jira/browse/YARN-3051 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Affects Versions: YARN-2928 >Reporter: Sangjin Lee >Assignee: Varun Saxena > Attachments: YARN-3051-YARN-2928.003.patch, > YARN-3051-YARN-2928.03.patch, YARN-3051-YARN-2928.04.patch, > YARN-3051.wip.02.YARN-2928.patch, YARN-3051.wip.patch, YARN-3051_temp.patch > > > Per design in YARN-2928, create backing storage read interface that can be > implemented by multiple backing storage implementations. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3051) [Storage abstraction] Create backing storage read interface for ATS readers
[ https://issues.apache.org/jira/browse/YARN-3051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14585178#comment-14585178 ]
Varun Saxena commented on YARN-3051:
bq. I think it's a good idea to let the user-oriented interface have more specific/advanced APIs to handle the special entity objects, while the storage interface could have fewer, more uniform APIs to reuse the common logic as much as possible. Thoughts?
After adding a lot of similar APIs, I am of the same view as well. A lot more detail can be added in javadoc. This would reduce code bloat.
> [Storage abstraction] Create backing storage read interface for ATS readers > --- > > Key: YARN-3051 > URL: https://issues.apache.org/jira/browse/YARN-3051 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Affects Versions: YARN-2928 >Reporter: Sangjin Lee >Assignee: Varun Saxena > Attachments: YARN-3051-YARN-2928.003.patch, > YARN-3051-YARN-2928.03.patch, YARN-3051-YARN-2928.04.patch, > YARN-3051.wip.02.YARN-2928.patch, YARN-3051.wip.patch, YARN-3051_temp.patch > > > Per design in YARN-2928, create backing storage read interface that can be > implemented by multiple backing storage implementations. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3051) [Storage abstraction] Create backing storage read interface for ATS readers
[ https://issues.apache.org/jira/browse/YARN-3051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14589036#comment-14589036 ]
Sangjin Lee commented on YARN-3051:
Sorry it has taken me a while to chime in on this JIRA. I've just gone over the recent comments, and also skimmed through the latest patch. BTW, the latest patch doesn't seem to apply cleanly (conflicts on {{yarn.cmd}}). [~varun_saxena], could you kindly check the latest patch to see if it needs to be updated?
I agree with most of the ideas put forward by folks in the comments. I agree with [~zjshen] that it'd be desirable to have more specific APIs for the user-oriented side of the code and a bit more generic (for lack of a better term) APIs on the side of the storage interaction (namely the {{TimelineReader}} interface in its current form).
The goals of the {{TimelineReader}} API are, first, that it should be generic/flexible enough to accommodate a wide range of queries being asked, including the current queries as well as possible future queries, and second, that it should help the storage implementations translate them into efficient queries onto the storage itself.
One idea that may help in this regard is to create further coarse-grained concepts and use them in the {{TimelineReader}} API. It's already doing that to some extent, and we should push that some more. For instance, it might be helpful to create *{{Context}}*. The unique context for most of the queries would involve the cluster id and the app id. So we can make the cluster id and the app id part of the {{Context}} object and have {{TimelineReader}} deal with {{Context}} instead of enumerating things like cluster id explicitly in its methods.
Similarly, we might want to define *predicates and/or filters*, and use them in the {{TimelineReader}} API. In essence, one way to look at it is that a query onto the storage is really (context) + (predicate/filters) + (contents to retrieve).
Then we could consolidate arguments into these coarse-grained things. Also, for the context, I don't think we need to require things like flow id or flow run id. The storage should be able to define the context and locate entities only with cluster id and the app id. > [Storage abstraction] Create backing storage read interface for ATS readers > --- > > Key: YARN-3051 > URL: https://issues.apache.org/jira/browse/YARN-3051 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Affects Versions: YARN-2928 >Reporter: Sangjin Lee >Assignee: Varun Saxena > Attachments: YARN-3051-YARN-2928.003.patch, > YARN-3051-YARN-2928.03.patch, YARN-3051-YARN-2928.04.patch, > YARN-3051.wip.02.YARN-2928.patch, YARN-3051.wip.patch, YARN-3051_temp.patch > > > Per design in YARN-2928, create backing storage read interface that can be > implemented by multiple backing storage implementations. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
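The (context) + (predicate/filters) + (contents to retrieve) decomposition suggested above might look something like the sketch below. All type names (`ReaderContext`, `EntityPredicates`, `EntityQuery`, `Field`) and their members are assumptions made for illustration; they do not come from the existing `Context` class or from any attached patch.

```java
import java.util.EnumSet;

// Hypothetical "contents to retrieve" selector.
enum Field { INFO, CONFIGS, METRICS, EVENTS }

// (context): enough to locate entities, here only cluster id and app id.
final class ReaderContext {
  final String clusterId;
  final String appId;

  ReaderContext(String clusterId, String appId) {
    this.clusterId = clusterId;
    this.appId = appId;
  }
}

// (predicates/filters): null means "not constrained".
final class EntityPredicates {
  final String entityType;
  final Long limit;
  final Long createdTimeBegin;
  final Long createdTimeEnd;

  EntityPredicates(String entityType, Long limit,
      Long createdTimeBegin, Long createdTimeEnd) {
    this.entityType = entityType;
    this.limit = limit;
    this.createdTimeBegin = createdTimeBegin;
    this.createdTimeEnd = createdTimeEnd;
  }
}

// A query bundles the three coarse-grained parts into one argument.
final class EntityQuery {
  final ReaderContext context;
  final EntityPredicates predicates;
  final EnumSet<Field> fieldsToRetrieve;

  EntityQuery(ReaderContext context, EntityPredicates predicates,
      EnumSet<Field> fieldsToRetrieve) {
    this.context = context;
    this.predicates = predicates;
    // Defensive copy so later changes by the caller do not leak in.
    this.fieldsToRetrieve = EnumSet.copyOf(fieldsToRetrieve);
  }

  // True when the caller asked for this part of the entity.
  boolean wants(Field field) {
    return fieldsToRetrieve.contains(field);
  }
}
```

With this shape, a reader method would take one `EntityQuery` instead of a long positional list in which most arguments are null.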
[jira] [Commented] (YARN-3051) [Storage abstraction] Create backing storage read interface for ATS readers
[ https://issues.apache.org/jira/browse/YARN-3051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14590133#comment-14590133 ]
Zhijie Shen commented on YARN-3051:
[~sjlee0], thanks for chiming in. Varun, Li and I recently had an offline discussion. In general, we agreed on focusing on the storage-oriented interface (raw data query) together with a FS implementation of it on this jira, but spinning off the changes to the user-oriented interface, web front wire-up, and single reader daemon setup and dealing with them separately. The rationale is to roll out the reader interface fast, so that we can work on the HBase/Phoenix implementation and web front wire-up on a commonly agreed interface in parallel. What do you think about the plan?
bq. It's already doing that to some extent, and we should push that some more. For instance, it might be helpful to create Context.
Context is useful. Instead of creating a new one, maybe we can reuse the existing Context, which hosts more content than the reader needs. So we just need to let the reader put/get the required information to/from it.
bq. In essence, one way to look at it is that a query onto the storage is really (context) + (predicate/filters) + (contents to retrieve). Then we could consolidate arguments into these coarse-grained things.
+1 LGTM, but I think it's for the query of searching a set of qualified entities, right? For fetching a single entity, the query may look like (context) + (entity identifier) + (contents to retrieve).
Another issue I want to raise is that after our performance evaluation, we agreed on using HBase for raw data and Phoenix for aggregated data. It implies that we need to use HBase to implement the APIs for the raw entities, while using Phoenix to implement the APIs for the aggregated data.
> [Storage abstraction] Create backing storage read interface for ATS readers > --- > > Key: YARN-3051 > URL: https://issues.apache.org/jira/browse/YARN-3051 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Affects Versions: YARN-2928 >Reporter: Sangjin Lee >Assignee: Varun Saxena > Attachments: YARN-3051-YARN-2928.003.patch, > YARN-3051-YARN-2928.03.patch, YARN-3051-YARN-2928.04.patch, > YARN-3051.wip.02.YARN-2928.patch, YARN-3051.wip.patch, YARN-3051_temp.patch > > > Per design in YARN-2928, create backing storage read interface that can be > implemented by multiple backing storage implementations. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3051) [Storage abstraction] Create backing storage read interface for ATS readers
[ https://issues.apache.org/jira/browse/YARN-3051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14592261#comment-14592261 ]
Sangjin Lee commented on YARN-3051:
{quote} Varun, Li and I recently had an offline discussion. In general, we agreed on focusing on the storage-oriented interface (raw data query) together with a FS implementation of it on this jira, but spinning off the changes to the user-oriented interface, web front wire-up, and single reader daemon setup and dealing with them separately. The rationale is to roll out the reader interface fast, so that we can work on the HBase/Phoenix implementation and web front wire-up on a commonly agreed interface in parallel. What do you think about the plan? {quote}
Agreed with the approach. I would go so far as focusing on the raw data reader part first and get that done and get to the aggregated reader later. Thoughts?
{quote} Context is useful. Instead of creating a new one, maybe we can reuse the existing Context, which hosts more content than the reader needs. So we just need to let the reader put/get the required information to/from it. {quote}
It should be fine, as long as it is clear we don't need to fill in all the info for the read path.
{quote} +1 LGTM, but I think it's for the query of searching a set of qualified entities, right? For fetching a single entity, the query may look like (context) + (entity identifier) + (contents to retrieve). {quote}
Yes, I agree. One can still think of the entity id as a special form of a "predicate". I'm not married to exactly one API; just the need to use a more coarse-grained approach.
{quote} Another issue I want to raise is that after our performance evaluation, we agreed on using HBase for raw data and Phoenix for aggregated data. It implies that we need to use HBase to implement the APIs for the raw entities, while using Phoenix to implement the APIs for the aggregated data. {quote}
We discussed this offline.
We can have a couple of different approaches for this. We could either have separate reader APIs for raw data and (time-based) aggregated data. Or we could hide the separation behind a facade reader implementation that dispatches calls to a HBase reader impl for raw data and those to a phoenix impl for aggregated data. Either way, it should be pretty straightforward. > [Storage abstraction] Create backing storage read interface for ATS readers > --- > > Key: YARN-3051 > URL: https://issues.apache.org/jira/browse/YARN-3051 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Affects Versions: YARN-2928 >Reporter: Sangjin Lee >Assignee: Varun Saxena > Attachments: YARN-3051-YARN-2928.003.patch, > YARN-3051-YARN-2928.03.patch, YARN-3051-YARN-2928.04.patch, > YARN-3051.wip.02.YARN-2928.patch, YARN-3051.wip.patch, YARN-3051_temp.patch > > > Per design in YARN-2928, create backing storage read interface that can be > implemented by multiple backing storage implementations. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
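The facade option mentioned above could be sketched as follows. This is only an illustration of the dispatch idea: `EntityReader`, `FacadeReader` and the stub readers are hypothetical, the real {{TimelineReader}} interface has a richer signature, and no actual HBase or Phoenix calls are made.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical storage-facing interface, reduced to one method for the sketch.
interface EntityReader {
  List<String> getEntityIds(String clusterId, String appId);
}

// Facade that hides which backend serves raw data and which serves aggregates.
final class FacadeReader {
  private final EntityReader rawReader;        // e.g. an HBase-backed impl
  private final EntityReader aggregatedReader; // e.g. a Phoenix-backed impl

  FacadeReader(EntityReader rawReader, EntityReader aggregatedReader) {
    this.rawReader = rawReader;
    this.aggregatedReader = aggregatedReader;
  }

  // Dispatches to the backend that owns the requested kind of data.
  List<String> getEntityIds(String clusterId, String appId, boolean aggregated) {
    EntityReader target = aggregated ? aggregatedReader : rawReader;
    return target.getEntityIds(clusterId, appId);
  }
}

final class FacadeReaderDemo {
  private FacadeReaderDemo() { }

  public static void main(String[] args) {
    // Stubs stand in for the real storage implementations.
    EntityReader hbaseStub = (c, a) -> Arrays.asList("raw-entity-for-" + a);
    EntityReader phoenixStub = (c, a) -> Arrays.asList("aggregate-for-" + a);
    FacadeReader reader = new FacadeReader(hbaseStub, phoenixStub);

    System.out.println(reader.getEntityIds("cluster1", "app1", false));
    System.out.println(reader.getEntityIds("cluster1", "app1", true));
  }
}
```

The alternative in the comment, separate reader APIs for raw and aggregated data, would expose the two `EntityReader` instances directly instead of wrapping them.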
[jira] [Commented] (YARN-3051) [Storage abstraction] Create backing storage read interface for ATS readers
[ https://issues.apache.org/jira/browse/YARN-3051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14592536#comment-14592536 ]
Zhijie Shen commented on YARN-3051:
bq. Agreed with the approach. I would go so far as focusing on the raw data reader part first and get that done and get to the aggregated reader later. Thoughts?
Exactly. Based on the discussion so far, I've drafted a patch of the reader APIs and attached it here. It contains just two methods: one to fetch a single entity and the other to search for a set of entities with given predicates. For the predicates, I started with the common stuff that we have in the timeline service v2 data model. Please take a look. Hopefully folks will be generally satisfied with the APIs. Then we can start from here and have more iterations to enrich the query semantics and support backward compatibility. Thoughts?
> [Storage abstraction] Create backing storage read interface for ATS readers > --- > > Key: YARN-3051 > URL: https://issues.apache.org/jira/browse/YARN-3051 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Affects Versions: YARN-2928 >Reporter: Sangjin Lee >Assignee: Varun Saxena > Attachments: YARN-3051-YARN-2928.003.patch, > YARN-3051-YARN-2928.03.patch, YARN-3051-YARN-2928.04.patch, > YARN-3051.Reader_API.patch, YARN-3051.wip.02.YARN-2928.patch, > YARN-3051.wip.patch, YARN-3051_temp.patch > > > Per design in YARN-2928, create backing storage read interface that can be > implemented by multiple backing storage implementations. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3051) [Storage abstraction] Create backing storage read interface for ATS readers
[ https://issues.apache.org/jira/browse/YARN-3051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14593029#comment-14593029 ] Varun Saxena commented on YARN-3051: bq. Or we could hide the separation behind a facade reader implementation that dispatches calls to a HBase reader impl for raw data and those to a phoenix impl for aggregated data +1 for this. > [Storage abstraction] Create backing storage read interface for ATS readers > --- > > Key: YARN-3051 > URL: https://issues.apache.org/jira/browse/YARN-3051 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Affects Versions: YARN-2928 >Reporter: Sangjin Lee >Assignee: Varun Saxena > Attachments: YARN-3051-YARN-2928.003.patch, > YARN-3051-YARN-2928.03.patch, YARN-3051-YARN-2928.04.patch, > YARN-3051.Reader_API.patch, YARN-3051.wip.02.YARN-2928.patch, > YARN-3051.wip.patch, YARN-3051_temp.patch > > > Per design in YARN-2928, create backing storage read interface that can be > implemented by multiple backing storage implementations. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3051) [Storage abstraction] Create backing storage read interface for ATS readers
[ https://issues.apache.org/jira/browse/YARN-3051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14593051#comment-14593051 ]
Varun Saxena commented on YARN-3051:
[~zjshen], regarding the reader API patch submitted by you, and comparing it with the patches already submitted, some comments:
{code}
TimelineEntity getEntity(
    String clusterId, String appId, String entityType, String entityId,
    EnumSet fieldsToRetrieve) throws IOException;

Set getEntities(
    String clusterId, String appId, Set entityTypes, Long limit,
    Long createdTimeBegin, Long createdTimeEnd,
    Long modifiedTimeBegin, Long modifiedTimeEnd,
    Set relatesTo, Set isRelatedTo, Set info, Set configs,
    Set events, Set metrics,
    EnumSet fieldsToRetrieve) throws IOException;
{code}
* We had decided that the user may not need to retrieve all the configs and metrics, and hence we should have a parameter to indicate that: a list of the metrics and configs the user wants to retrieve, for both APIs. I had included this in the patch I had made. Do we need it?
* Shouldn't we have metrics filters to support queries like fetching entities which have a metric > a certain value? In the patch I had included support for relational operators.
* What is a query use case for having relatesTo and isRelatedTo as filters?
* We do not need flowId and flowRunId to get an entity. But they can still be optional arguments so that we avoid a peek into the table which gets them based on cluster id and app id. Thoughts?
* Will we fetch entities across entityTypes? We also have events as filters here. They may not match across entity types. Thoughts?
* As per our previous discussion, I had also included metrics time windows in the APIs. This may aid in plotting graphs for long running apps. Thoughts?
> [Storage abstraction] Create backing storage read interface for ATS readers > --- > > Key: YARN-3051 > URL: https://issues.apache.org/jira/browse/YARN-3051 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Affects Versions: YARN-2928 >Reporter: Sangjin Lee >Assignee: Varun Saxena > Attachments: YARN-3051-YARN-2928.003.patch, > YARN-3051-YARN-2928.03.patch, YARN-3051-YARN-2928.04.patch, > YARN-3051.Reader_API.patch, YARN-3051.wip.02.YARN-2928.patch, > YARN-3051.wip.patch, YARN-3051_temp.patch > > > Per design in YARN-2928, create backing storage read interface that can be > implemented by multiple backing storage implementations. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3051) [Storage abstraction] Create backing storage read interface for ATS readers
[ https://issues.apache.org/jira/browse/YARN-3051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14593627#comment-14593627 ]
Zhijie Shen commented on YARN-3051:
First of all, I'd like to say it's not the finalized reader API, but one we are okay to start with: two types of query, and the set of essential parameters, which focus on tuning what entities to return. We can definitely iterate over the APIs to add more parameters to trim the results, and to control sub-entity information.
bq. We had decided that the user may not need to retrieve all the configs and metrics, and hence we should have a parameter to indicate that: a list of the metrics and configs the user wants to retrieve, for both APIs. I had included this in the patch I had made. Do we need it?
Yeah, we could have these parameters, but I'm wondering about an efficient way to retrieve part of the configs/metrics in a huge set. For example, if I'm interested in all the mapred configs of my job, what should I do? Enumerating all the mapred configs I want to retrieve in the query parameter is a nightmare. My immediate thought is regex, but I don't want to include this parameter in the original version until we're clear about how to specify it.
bq. Shouldn't we have metrics filters to support queries like fetching entities which have a metric > a certain value? In the patch I had included support for relational operators.
We should. See my TODO comment. The problem again is that it's not a simple predicate. How do we want to abstract and support it? You give the example ">", but we need to take care of "<", "=", "!=", "like" and so on.
bq. We do not need flowId and flowRunId to get an entity. But they can still be optional arguments so that we avoid a peek into the table which gets them based on cluster id and app id. Thoughts?
Yeah, it makes sense. Imagine we have the web UI, and the user is directed from the flow page to the app page and moves on; he's going to carry the flow information. If the user can provide flowId/flowRunId, we can locate the entity more efficiently. We can have the two params and make them optional. Also, it seems that I've missed userId too. It's the first piece of the entity key. IMHO, we should have it and make it mandatory to avoid scanning through the whole key space. And it should be reasonable that we take the requester as the user and only search his entity space, not others'.
bq. Will we fetch entities across entityTypes? We also have events as filters here. They may not match across entity types. Thoughts?
Good point, let's go with a single entityType first.
bq. As per our previous discussion, I had also included metrics time windows in the APIs. This may aid in plotting graphs for long running apps. Thoughts?
This seems to belong to (contents to retrieve), and it's not difficult to enforce the window. We can add this to the param list. One question is whether we want to specify the window per metric or for all metrics. Personally, I prefer to defer it together with fetching particular configs/metrics in a later enhancement about (contents to retrieve). What do you think?
I've updated the Reader interface accordingly.
> [Storage abstraction] Create backing storage read interface for ATS readers > --- > > Key: YARN-3051 > URL: https://issues.apache.org/jira/browse/YARN-3051 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Affects Versions: YARN-2928 >Reporter: Sangjin Lee >Assignee: Varun Saxena > Attachments: YARN-3051-YARN-2928.003.patch, > YARN-3051-YARN-2928.03.patch, YARN-3051-YARN-2928.04.patch, > YARN-3051.Reader_API.patch, YARN-3051.Reader_API_1.patch, > YARN-3051.wip.02.YARN-2928.patch, YARN-3051.wip.patch, YARN-3051_temp.patch > > > Per design in YARN-2928, create backing storage read interface that can be > implemented by multiple backing storage implementations. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3051) [Storage abstraction] Create backing storage read interface for ATS readers
[ https://issues.apache.org/jira/browse/YARN-3051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14593710#comment-14593710 ] Varun Saxena commented on YARN-3051:
Thanks [~zjshen] for your comments.
bq. Yeah, we could have these parameters, but I'm wondering the efficient way to retrieve part of the configs/metrics in a huge set.
Makes sense. We could use a regex, or club different configs into different groups and let the user query a group. But then the problem will be how we specify those groups. So, as you say, let's defer it and discuss it at length when we take it up.
bq. You give the example ">", but we need to take care of "<", "=", "!=", "like" and so on.
Yes, we should support all relational operators. I had implemented that as well in the patch. We can defer this, though, if we do not envisage having store implementations for it as of now.
bq. Personally, I prefer to defer it together with fetching particular configs/metrics in a later enhancement about (contents to retrieve). How do you think?
OK, let's defer it. Overall, the proposed store interface in the latest attached file LGTM. I will go ahead and implement it over the weekend if no further comments come. One thing though: along the lines of the patch submitted earlier, I can include something like {{Map}} for metrics in the interface for specifying relational operations. It will support things like metricA>val1 and metricA
[jira] [Commented] (YARN-3051) [Storage abstraction] Create backing storage read interface for ATS readers
[ https://issues.apache.org/jira/browse/YARN-3051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14593826#comment-14593826 ] Joep Rottinghuis commented on YARN-3051:
Not all arguments are equally selective. For example, relatesTo (entities) are not stored in individual cells that can be used as a push-down predicate for the HBase tables. We'd have to select all entities that match the other criteria, select the relatesTo string, parse it into individual fields and do set operations on them.
{code}
Set getEntities(String userId, String clusterId, String flowId,
    String flowRunId, String appId, String entityType, Long limit,
    Long createdTimeBegin, Long createdTimeEnd,
    Long modifiedTimeBegin, Long modifiedTimeEnd,
    Set relatesTo, Set isRelatedTo, Set info, Set configs,
    Set events, Set metrics, EnumSet fieldsToRetrieve) throws IOException;
}
{code}
If we defer being able to effectively select a subset of columns, what does it actually mean to specify a Set ? Can the value be null to indicate that we don't care what the value is and that means that we want the column back in the result? I think we should separate out predicates (give me all X where Y=Z) versus selectors (give me all X...). It is not clear in the latest patch if fully populated entities will be returned.
Wrt.
{quote}
Makes sense. We could use a regex or club different configs into different groups and let user query that group. But then the problem will be how do we specify those groups. So as you say lets defer it and discuss it at length when we take it up.
{quote}
and
{quote}
One thing though, along the lines of patch submitted earlier, I can include something like Map for metrics in the interface for specifying relational operations . It will support things like metricA>val1 and metricAhttps://hbase.apache.org/apidocs/org/apache/hadoop/hbase/filter/FilterBase.html) to aggressively reduce what we pull back from HBase.
ColumnPrefixFilter, for example, will be a good way to express which config columns to retrieve. A regex will be a poor way, as that will result in having to pull back every column and then drop values from the retrieved result. Similarly, if our rowkeys are prefixed by users, then creating an API that doesn't include the user (only the cluster) means that we're doing a full table scan, albeit with skip filters that let us skip over users that we're not interested in.
In an earlier patch I saw NameValueRelation, which was able to perform the operations. That again assumes that all values will be retrieved from the backing store and then filtered in the reader before being returned to the user. It will be more effective to make sure we can easily map this to operations we can push into HBase itself (through a ColumnValueFilter) using the available operations (https://hbase.apache.org/apidocs/org/apache/hadoop/hbase/filter/CompareFilter.CompareOp.html). I'm certainly not arguing to have these HBase-specific classes exposed in our API, but our methods should closely match what can be done, which I don't think will be overly restrictive or unreasonable.
If we're going to have two types of tables in the backing store:
a) HBase native tables, specifically structured for efficient storage and retrieval, and
b) Phoenix tables (mainly time-based aggregates and aggregates over non-primary-key prefixes), specifically structured for flexible querying,
would it make sense to break these two queries into separate families? Or are we thinking that, based on what arguments are passed in, we decide which tables to query with which mechanism?
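To make the predicate/selector distinction concrete, here is a minimal sketch in plain Java. All names ({{MetricCompareOp}}, {{MetricPredicate}}, {{ConfigPrefixSelector}}) are hypothetical and do not come from the patch or from HBase; the evaluation is done in memory purely for illustration, whereas an HBase-backed store would presumably translate the same descriptions into server-side filters instead of evaluating them in the reader.

```java
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical relational operators, chosen so each could map onto a store-side comparison. */
enum MetricCompareOp {
  LESS, LESS_OR_EQUAL, EQUAL, NOT_EQUAL, GREATER_OR_EQUAL, GREATER;

  boolean test(long actual, long expected) {
    switch (this) {
      case LESS: return actual < expected;
      case LESS_OR_EQUAL: return actual <= expected;
      case EQUAL: return actual == expected;
      case NOT_EQUAL: return actual != expected;
      case GREATER_OR_EQUAL: return actual >= expected;
      case GREATER: return actual > expected;
      default: throw new IllegalStateException("unknown operator " + this);
    }
  }
}

/** Predicate: decides WHICH entities match (give me all X where Y op Z). */
final class MetricPredicate {
  final String metricId;
  final MetricCompareOp op;
  final long value;

  MetricPredicate(String metricId, MetricCompareOp op, long value) {
    this.metricId = metricId;
    this.op = op;
    this.value = value;
  }

  /** An entity without the metric never matches in this sketch. */
  boolean matches(Map<String, Long> metrics) {
    Long actual = metrics.get(metricId);
    return actual != null && op.test(actual, value);
  }
}

/** Selector: decides WHAT is returned for a matching entity (here, configs by key prefix). */
final class ConfigPrefixSelector {
  final String prefix;

  ConfigPrefixSelector(String prefix) {
    this.prefix = prefix;
  }

  Map<String, String> select(Map<String, String> configs) {
    Map<String, String> selected = new TreeMap<>();
    for (Map.Entry<String, String> e : configs.entrySet()) {
      if (e.getKey().startsWith(prefix)) {
        selected.put(e.getKey(), e.getValue());
      }
    }
    return selected;
  }
}
```

The prefix form is deliberately less expressive than a regex: a key prefix is the kind of condition a column-oriented store can evaluate without first pulling back every column.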
[jira] [Commented] (YARN-3051) [Storage abstraction] Create backing storage read interface for ATS readers
[ https://issues.apache.org/jira/browse/YARN-3051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14594029#comment-14594029 ] Zhijie Shen commented on YARN-3051: ---
Thanks for chiming in, Joep! Here's my reply:
bq. It is not clear in the latest patch if fully populated entities will be returned.
We may not need to worry about it too much. The two APIs are supposed to fetch the raw data. We use user Id + cluster Id + app Id + entity type to efficiently narrow down the scope to search for entities and to limit the number of entities there could be. The optional parameters that follow will further trim the result set.
bq. If we defer being able to effectively select a subset of columns, what does it actually mean to specify a Set ? Can the value be null to indicate that we don't care what the value is and that means that we want the column back in the result?
I have updated the javadoc to be more specific about what the parameters are supposed to do and whether they're mandatory or optional.
bq. I'm certainly not arguing to have these HBase specific classes exposed in our API, but our methods should closely match what can be done, which I don't think will be overly restrictive or unreasonable.
I think it's a good suggestion. We should double-check whether our customized filters can be easily mapped to HBase filters.
I've put up a new patch.
[jira] [Commented] (YARN-3051) [Storage abstraction] Create backing storage read interface for ATS readers
[ https://issues.apache.org/jira/browse/YARN-3051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14594067#comment-14594067 ] Joep Rottinghuis commented on YARN-3051:
Thanks [~zjshen], those additional comments in the javadoc explain the bigger picture. A few more questions that would be good to clarify:
{code}
120* @return a set of {@link TimelineEntity} instances of the given entity type
121* in the given context scope which matches the given predicates
122* ordered by created time. Each entity will only contain the metadata
123* plus the given fields to retrieve
{code}
With "matches", presumably you mean an _and_ relationship, where all must be true, not an _or_ where only one of them needs to match, correct?
l122: The "ordered by created time." refers to how the optional limit is applied, not that we actually return an ordered set, right?
{code}
118* @param fieldsToRetrieve
118* the fields to be be returned (optional, by default {@link Field#ALL}
119* will be retrieved)
{code}
Probably obvious, but once ALL is specified, all fields will be returned, even if only some Fields are specified and others not.
[jira] [Commented] (YARN-3051) [Storage abstraction] Create backing storage read interface for ATS readers
[ https://issues.apache.org/jira/browse/YARN-3051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14594080#comment-14594080 ] Joep Rottinghuis commented on YARN-3051:
The same question goes for the items in the Set relatesTo etc. Do the retrieved entities have to have at least one of the related to entities match, or all of them? What if there are more related entities, do we want to retrieve only those with the provided related entities but no more? It sounds like nit-picking, but the implementations would differ quite a bit, so it is good to express what it is that we want to do.
Rather than locking in on one interpretation, what if we take a page out of the HBase manual and we could specify that a filter needs to be applied? We can then supply RelatesToFilter, InfoFilter, etc. Filters can be combined with FilterList where you can specify MUST_PASS_ALL, MUST_PASS_ONE (see for example https://hbase.apache.org/apidocs/org/apache/hadoop/hbase/filter/FilterList.html).
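The filter-list idea above can be sketched without any HBase dependency. The types below ({{EntityFilter}}, {{EntityFilterList}}, and this particular {{RelatesToFilter}}) are hypothetical illustrations that only borrow the MUST_PASS_ALL / MUST_PASS_ONE naming; they are not the HBase classes and not part of the patch. For simplicity, a candidate is represented here as the set of entity ids it relates to.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Set;

/** Hypothetical reader-side filter abstraction. */
interface EntityFilter<T> {
  boolean accept(T candidate);
}

/** Combines filters with explicit AND / OR semantics. */
final class EntityFilterList<T> implements EntityFilter<T> {
  enum Operator { MUST_PASS_ALL, MUST_PASS_ONE }

  private final Operator operator;
  private final List<EntityFilter<T>> filters;

  @SafeVarargs
  EntityFilterList(Operator operator, EntityFilter<T>... filters) {
    this.operator = operator;
    this.filters = new ArrayList<>(Arrays.asList(filters));
  }

  @Override
  public boolean accept(T candidate) {
    for (EntityFilter<T> filter : filters) {
      boolean passed = filter.accept(candidate);
      if (operator == Operator.MUST_PASS_ALL && !passed) {
        return false; // AND: one failure rejects the candidate
      }
      if (operator == Operator.MUST_PASS_ONE && passed) {
        return true; // OR: one success accepts the candidate
      }
    }
    // Reached when every filter passed (AND) or none passed (OR).
    // In this sketch an empty list accepts under AND and rejects under OR.
    return operator == Operator.MUST_PASS_ALL;
  }
}

/** Accepts a candidate whose related-entity ids contain the required id. */
final class RelatesToFilter implements EntityFilter<Set<String>> {
  private final String requiredId;

  RelatesToFilter(String requiredId) {
    this.requiredId = requiredId;
  }

  @Override
  public boolean accept(Set<String> relatedIds) {
    return relatedIds.contains(requiredId);
  }
}
```

With this shape, "at least one of the related entities" and "all of them" become two operator values on the same list rather than two different API interpretations; the "exactly these and no more" case would need its own filter.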
[jira] [Commented] (YARN-3051) [Storage abstraction] Create backing storage read interface for ATS readers
[ https://issues.apache.org/jira/browse/YARN-3051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14594098#comment-14594098 ] Zhijie Shen commented on YARN-3051: ---
Yeah, I meant AND logic for these parameters. I think it's good to declare it explicitly. To extend the parameters, we can add more of them, such as AND and OR. I agree it's good to take a look at the HBase filter abstraction and draft ours accordingly. I consider that a code improvement of the filter abstraction; other than that, hopefully we can agree on using these filters.
bq. Probably obvious, but once ALL is specified, all fields will be returned, even if only some Fields are specified and others not.
Exactly.
[jira] [Commented] (YARN-3051) [Storage abstraction] Create backing storage read interface for ATS readers
[ https://issues.apache.org/jira/browse/YARN-3051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14594170#comment-14594170 ] Joep Rottinghuis commented on YARN-3051:
When discussing with [~sjlee0] we noticed a couple of other items. The isRelatedTo argument takes a Set, but TimelineEntity.getIsRelatedToEntities() returns a Map. Presumably these correspond, but the Map at least enforces that each key (the entity type) occurs only once. [~sjlee0] spotted a few other flaws that occur in multiple classes. I'll let him chime in.
[jira] [Commented] (YARN-3051) [Storage abstraction] Create backing storage read interface for ATS readers
[ https://issues.apache.org/jira/browse/YARN-3051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14594180#comment-14594180 ] Sangjin Lee commented on YARN-3051: ---
To clarify, I think as a rule it would be good for these arguments to match (or follow closely) the types defined in {{TimelineEntity}}. In its current form, if we used {{Set}} it would match *any* relationship. It might be better to qualify the match with the right type of relationship. Thoughts?
[jira] [Commented] (YARN-3051) [Storage abstraction] Create backing storage read interface for ATS readers
[ https://issues.apache.org/jira/browse/YARN-3051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14594192#comment-14594192 ] Sangjin Lee commented on YARN-3051: ---
I'd like to discuss {{KeyValuePair}}. We're using {{Set}} for config and info. However, in {{TimelineEntity}}, we have
{code}
info: HashMap<String, Object>
config: HashMap<String, String>
{code}
Wouldn't it be better to use these types (i.e. maps v. sets) for info and config instead of using {{KeyValuePair}}? That would also naturally resolve any issues with duplicate keys, etc. The way it stands, since {{KeyValuePair}} does not override {{hashCode()}} or {{equals()}}, {{Set}} would allow entries with duplicate keys. I just think it'd be better to stick with the same types used by {{TimelineEntity}}.
BTW, we also noticed that neither {{TimelineEntity}} nor {{TimelineEntity.Identifier}} implements {{equals()}} or {{hashCode()}}. This will be problematic whenever we put them in a collection such as a set. We should define the equality semantics on them and add those methods so that they can be used safely in a set or as keys in a map. I'll probably file a separate JIRA on this point. Thoughts?
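The duplicate-key concern can be reproduced with a few lines of plain Java. {{PlainPair}} below is a hypothetical stand-in for any pair type that inherits identity-based {{equals()}} and {{hashCode()}} from {{Object}}; it is not the {{KeyValuePair}} class from the patch.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical pair type that does NOT override equals() or hashCode(). */
final class PlainPair {
  final String key;
  final String value;

  PlainPair(String key, String value) {
    this.key = key;
    this.value = value;
  }
}

final class DuplicateKeyDemo {
  /** A HashSet falls back to object identity, so both entries are kept. */
  static int sizeAsSet() {
    Set<PlainPair> configs = new HashSet<>();
    configs.add(new PlainPair("mapreduce.job.name", "first"));
    configs.add(new PlainPair("mapreduce.job.name", "second"));
    return configs.size(); // 2: the same key appears twice
  }

  /** A map keyed by the config name keeps a single entry per key. */
  static int sizeAsMap() {
    Map<String, String> configs = new HashMap<>();
    configs.put("mapreduce.job.name", "first");
    configs.put("mapreduce.job.name", "second");
    return configs.size(); // 1: the later value replaces the earlier one
  }
}
```

Even two pairs with identical key and value would both survive in the set, which is why the same reasoning applies to putting entities or identifiers without {{equals()}}/{{hashCode()}} into a set.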
[jira] [Commented] (YARN-3051) [Storage abstraction] Create backing storage read interface for ATS readers
[ https://issues.apache.org/jira/browse/YARN-3051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14594194#comment-14594194 ] Hadoop QA commented on YARN-3051: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | pre-patch | 14m 45s | Pre-patch trunk compilation is healthy. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:red}-1{color} | tests included | 0m 0s | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. | | {color:green}+1{color} | javac | 7m 36s | There were no new javac warning messages. | | {color:green}+1{color} | javadoc | 9m 42s | There were no new javadoc warning messages. | | {color:green}+1{color} | release audit | 0m 22s | The applied patch does not increase the total number of release audit warnings. | | {color:green}+1{color} | checkstyle | 0m 0s | There were no new checkstyle issues. | | {color:green}+1{color} | whitespace | 0m 0s | The patch has no lines that end in whitespace. | | {color:green}+1{color} | install | 1m 32s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 32s | The patch built with eclipse:eclipse. | | {color:green}+1{color} | findbugs | 0m 0s | The patch does not introduce any new Findbugs (version ) warnings. 
| | | | 34m 35s | | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12740760/YARN-3051.Reader_API_3.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / 49f5d20 | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/8290/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf907.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/8290/console | This message was automatically generated. > [Storage abstraction] Create backing storage read interface for ATS readers > --- > > Key: YARN-3051 > URL: https://issues.apache.org/jira/browse/YARN-3051 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Affects Versions: YARN-2928 >Reporter: Sangjin Lee >Assignee: Varun Saxena > Attachments: YARN-3051-YARN-2928.003.patch, > YARN-3051-YARN-2928.03.patch, YARN-3051-YARN-2928.04.patch, > YARN-3051.Reader_API.patch, YARN-3051.Reader_API_1.patch, > YARN-3051.Reader_API_2.patch, YARN-3051.Reader_API_3.patch, > YARN-3051.wip.02.YARN-2928.patch, YARN-3051.wip.patch, YARN-3051_temp.patch > > > Per design in YARN-2928, create backing storage read interface that can be > implemented by multiple backing storage implementations. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3051) [Storage abstraction] Create backing storage read interface for ATS readers
[ https://issues.apache.org/jira/browse/YARN-3051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14594209#comment-14594209 ] Zhijie Shen commented on YARN-3051: ---
Good catch! I reverted it back to map. Set is the legacy from the v1 reader API.
bq. I'll probably file a separate JIRA on this point. Thoughts?
Yeah, please go ahead, not just for entity/identifier, but all data model objects.
[jira] [Commented] (YARN-3051) [Storage abstraction] Create backing storage read interface for ATS readers
[ https://issues.apache.org/jira/browse/YARN-3051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14594224#comment-14594224 ] Sangjin Lee commented on YARN-3051: --- YARN-3836 added. > [Storage abstraction] Create backing storage read interface for ATS readers > --- > > Key: YARN-3051 > URL: https://issues.apache.org/jira/browse/YARN-3051 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Affects Versions: YARN-2928 >Reporter: Sangjin Lee >Assignee: Varun Saxena > Attachments: YARN-3051-YARN-2928.003.patch, > YARN-3051-YARN-2928.03.patch, YARN-3051-YARN-2928.04.patch, > YARN-3051.Reader_API.patch, YARN-3051.Reader_API_1.patch, > YARN-3051.Reader_API_2.patch, YARN-3051.Reader_API_3.patch, > YARN-3051.Reader_API_4.patch, YARN-3051.wip.02.YARN-2928.patch, > YARN-3051.wip.patch, YARN-3051_temp.patch > > > Per design in YARN-2928, create backing storage read interface that can be > implemented by multiple backing storage implementations. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3051) [Storage abstraction] Create backing storage read interface for ATS readers
[ https://issues.apache.org/jira/browse/YARN-3051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14594241#comment-14594241 ] Joep Rottinghuis commented on YARN-3051:
Can somebody please remind me why TimelineEntity has HashMap<String, String> for configs and not HashMap<String, Object> as in info? o.a.h.c.Configuration can have things like Boolean, Double, BigDecimal etc., right? We're not retaining that? I think the HBaseWriterImpl has capabilities to simply serialize all these correctly through the GenericObjectMapper.
[jira] [Commented] (YARN-3051) [Storage abstraction] Create backing storage read interface for ATS readers
[ https://issues.apache.org/jira/browse/YARN-3051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14594243#comment-14594243 ] Sangjin Lee commented on YARN-3051: ---
Hadoop's {{Configuration}} is actually (string, string). Typed values are eventually passed on to {{Configuration}} as strings. For example, the iterator for {{Configuration}}:
{code}
public Iterator<Map.Entry<String, String>> iterator();
{code}
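The point about typed values ending up as strings can be illustrated with a small stand-in. {{StringBackedConf}} below is a hypothetical class written for this example; it only mimics the idea of typed setters and getters layered over a (string, string) store and is not Hadoop's {{Configuration}}.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in: typed accessors over a purely (string, string) store. */
final class StringBackedConf {
  private final Map<String, String> props = new HashMap<>();

  // Typed setters convert to a string before storing.
  void setInt(String name, int value) {
    props.put(name, Integer.toString(value));
  }

  void setBoolean(String name, boolean value) {
    props.put(name, Boolean.toString(value));
  }

  // Typed getters parse the stored string on the way out.
  int getInt(String name, int defaultValue) {
    String raw = props.get(name);
    return raw == null ? defaultValue : Integer.parseInt(raw.trim());
  }

  /** What a reader iterating over the store would see: always a string. */
  String getRaw(String name) {
    return props.get(name);
  }
}
```

Under this model the type is a property of how the caller reads a value, not of what is stored, which is consistent with keeping configs as string-to-string in the entity.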
[jira] [Commented] (YARN-3051) [Storage abstraction] Create backing storage read interface for ATS readers
[ https://issues.apache.org/jira/browse/YARN-3051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14594249#comment-14594249 ] Zhijie Shen commented on YARN-3051: ---
And if we process configs in bulk, such as when loading a config file, it's a bit difficult to assume we know the type of each config upfront.
[jira] [Commented] (YARN-3051) [Storage abstraction] Create backing storage read interface for ATS readers
[ https://issues.apache.org/jira/browse/YARN-3051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14594410#comment-14594410 ] Zhijie Shen commented on YARN-3051: ---
[~varun_saxena], would you please take over the reader API patch and move it forward, i.e., consolidate the comments, implement the FS-based reader, wire it up to the web front end, build the reader server and so on?
[jira] [Commented] (YARN-3051) [Storage abstraction] Create backing storage read interface for ATS readers
[ https://issues.apache.org/jira/browse/YARN-3051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14594468#comment-14594468 ] Hadoop QA commented on YARN-3051: -

| (x) *{color:red}-1 overall{color}* |

|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch | 14m 53s | Pre-patch trunk compilation is healthy. |
| {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. |
| {color:red}-1{color} | tests included | 0m 0s | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. |
| {color:green}+1{color} | javac | 7m 56s | There were no new javac warning messages. |
| {color:green}+1{color} | javadoc | 10m 2s | There were no new javadoc warning messages. |
| {color:green}+1{color} | release audit | 0m 23s | The applied patch does not increase the total number of release audit warnings. |
| {color:green}+1{color} | checkstyle | 0m 0s | There were no new checkstyle issues. |
| {color:green}+1{color} | whitespace | 0m 0s | The patch has no lines that end in whitespace. |
| {color:green}+1{color} | install | 1m 33s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse | 0m 33s | The patch built with eclipse:eclipse. |
| {color:green}+1{color} | findbugs | 0m 0s | The patch does not introduce any new Findbugs (version ) warnings. |
| | | 35m 25s | |

|| Subsystem || Report/Notes ||
| Patch URL | http://issues.apache.org/jira/secure/attachment/12740776/YARN-3051.Reader_API_4.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / 20c03c9 |
| Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/8295/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf904.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | https://builds.apache.org/job/PreCommit-YARN-Build/8295/console |

This message was automatically generated.

> [Storage abstraction] Create backing storage read interface for ATS readers > --- > > Key: YARN-3051 > URL: https://issues.apache.org/jira/browse/YARN-3051 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Affects Versions: YARN-2928 >Reporter: Sangjin Lee >Assignee: Varun Saxena > Attachments: YARN-3051-YARN-2928.003.patch, > YARN-3051-YARN-2928.03.patch, YARN-3051-YARN-2928.04.patch, > YARN-3051.Reader_API.patch, YARN-3051.Reader_API_1.patch, > YARN-3051.Reader_API_2.patch, YARN-3051.Reader_API_3.patch, > YARN-3051.Reader_API_4.patch, YARN-3051.wip.02.YARN-2928.patch, > YARN-3051.wip.patch, YARN-3051_temp.patch > > > Per design in YARN-2928, create backing storage read interface that can be > implemented by multiple backing storage implementations. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3051) [Storage abstraction] Create backing storage read interface for ATS readers
[ https://issues.apache.org/jira/browse/YARN-3051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14604932#comment-14604932 ] Hadoop QA commented on YARN-3051: -

| (x) *{color:red}-1 overall{color}* |

|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | pre-patch | 16m 56s | Findbugs (version ) appears to be broken on YARN-2928. |
| {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. |
| {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 1 new or modified test files. |
| {color:green}+1{color} | javac | 7m 42s | There were no new javac warning messages. |
| {color:green}+1{color} | javadoc | 9m 44s | There were no new javadoc warning messages. |
| {color:green}+1{color} | release audit | 0m 23s | The applied patch does not increase the total number of release audit warnings. |
| {color:red}-1{color} | checkstyle | 1m 23s | The applied patch generated 3 new checkstyle issues (total was 234, now 236). |
| {color:green}+1{color} | whitespace | 0m 0s | The patch has no lines that end in whitespace. |
| {color:green}+1{color} | install | 1m 38s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse | 0m 40s | The patch built with eclipse:eclipse. |
| {color:red}-1{color} | findbugs | 3m 56s | The patch appears to introduce 1 new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | yarn tests | 0m 25s | Tests passed in hadoop-yarn-api. |
| {color:green}+1{color} | yarn tests | 1m 57s | Tests passed in hadoop-yarn-common. |
| {color:green}+1{color} | yarn tests | 1m 18s | Tests passed in hadoop-yarn-server-timelineservice. |
| | | 46m 29s | |

|| Reason || Tests ||
| FindBugs | module:hadoop-yarn-server-timelineservice |

|| Subsystem || Report/Notes ||
| Patch URL | http://issues.apache.org/jira/secure/attachment/12742423/YARN-3051-YARN-2928.05.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | YARN-2928 / 84f37f1 |
| checkstyle | https://builds.apache.org/job/PreCommit-YARN-Build/8370/artifact/patchprocess/diffcheckstylehadoop-yarn-api.txt |
| Findbugs warnings | https://builds.apache.org/job/PreCommit-YARN-Build/8370/artifact/patchprocess/newPatchFindbugsWarningshadoop-yarn-server-timelineservice.html |
| hadoop-yarn-api test log | https://builds.apache.org/job/PreCommit-YARN-Build/8370/artifact/patchprocess/testrun_hadoop-yarn-api.txt |
| hadoop-yarn-common test log | https://builds.apache.org/job/PreCommit-YARN-Build/8370/artifact/patchprocess/testrun_hadoop-yarn-common.txt |
| hadoop-yarn-server-timelineservice test log | https://builds.apache.org/job/PreCommit-YARN-Build/8370/artifact/patchprocess/testrun_hadoop-yarn-server-timelineservice.txt |
| Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/8370/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf906.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | https://builds.apache.org/job/PreCommit-YARN-Build/8370/console |

This message was automatically generated.
[jira] [Commented] (YARN-3051) [Storage abstraction] Create backing storage read interface for ATS readers
[ https://issues.apache.org/jira/browse/YARN-3051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14605376#comment-14605376 ] Varun Saxena commented on YARN-3051:

[~jrottinghuis],

bq. In an earlier patch I saw NameValueRelation that was able to perform the operations. That again assumes that all values will be retrieved from the backing store, and then filtered in the reader before returned to the user. It will be more effective to make sure we can easily map this to operations we can push into HBase itself (through a ColumnValueFilter) through the available operations

NameValueRelation would be used in metric filters and specifies the metric name and the relation to its value. The store implementation does not have to use its {{match}} function; it was added for use by the FS-based implementation. {{RelationOp}}, for instance, maps directly to {{CompareFilter.CompareOp}}, although that wasn't intentional, so it should not be too difficult for the backend implementation to convert it into an HBase Filter. Currently only AND operations are supported. Support for OR operations will be handled as part of other JIRAs.

> [Storage abstraction] Create backing storage read interface for ATS readers > --- > > Key: YARN-3051 > URL: https://issues.apache.org/jira/browse/YARN-3051 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Affects Versions: YARN-2928 >Reporter: Sangjin Lee >Assignee: Varun Saxena > Attachments: YARN-3051-YARN-2928.003.patch, > YARN-3051-YARN-2928.03.patch, YARN-3051-YARN-2928.04.patch, > YARN-3051-YARN-2928.05.patch, YARN-3051.Reader_API.patch, > YARN-3051.Reader_API_1.patch, YARN-3051.Reader_API_2.patch, > YARN-3051.Reader_API_3.patch, YARN-3051.Reader_API_4.patch, > YARN-3051.wip.02.YARN-2928.patch, YARN-3051.wip.patch, YARN-3051_temp.patch > > > Per design in YARN-2928, create backing storage read interface that can be > implemented by multiple backing storage implementations.
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
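The metric filter described in the comment above can be pictured with a small sketch. This is not the code from the patch: the class shapes, the `long` metric value, and the `matchAll` helper are assumptions made for illustration. Only the names NameValueRelation, RelationOp and match come from the thread, and the enum constants are chosen here to mirror those of HBase's CompareFilter.CompareOp so that a backend could translate them one to one.

```java
import java.util.List;
import java.util.Map;

/** Hypothetical operators; constant names mirror HBase's CompareFilter.CompareOp. */
enum RelationOp {
  LESS, LESS_OR_EQUAL, EQUAL, NOT_EQUAL, GREATER_OR_EQUAL, GREATER
}

/** Hypothetical metric filter: metric name, operator, and the value to compare against. */
final class NameValueRelation {
  final String name;
  final RelationOp op;
  final long value;

  NameValueRelation(String name, RelationOp op, long value) {
    this.name = name;
    this.op = op;
    this.value = value;
  }

  /** In-memory check, as an FS-based reader might apply it after loading an entity. */
  boolean match(long actual) {
    int cmp = Long.compare(actual, value);
    switch (op) {
      case LESS: return cmp < 0;
      case LESS_OR_EQUAL: return cmp <= 0;
      case EQUAL: return cmp == 0;
      case NOT_EQUAL: return cmp != 0;
      case GREATER_OR_EQUAL: return cmp >= 0;
      case GREATER: return cmp > 0;
      default: throw new IllegalStateException("Unknown operator: " + op);
    }
  }

  /** AND semantics only: every relation must hold, and a missing metric fails the filter. */
  static boolean matchAll(Map<String, Long> metrics, List<NameValueRelation> filters) {
    for (NameValueRelation filter : filters) {
      Long actual = metrics.get(filter.name);
      if (actual == null || !filter.match(actual)) {
        return false;
      }
    }
    return true;
  }
}
```

A store-backed reader would presumably skip `match` and convert each relation into a server-side filter instead, which is the push-down concern raised in the quoted comment.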
[jira] [Commented] (YARN-3051) [Storage abstraction] Create backing storage read interface for ATS readers
[ https://issues.apache.org/jira/browse/YARN-3051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14606458#comment-14606458 ] Li Lu commented on YARN-3051: - Hi [~varun_saxena], thanks for the patch! I think this version is much closer and we'll have a reader-storage interface soon. The general approach looks fine, but I have one concern. While looking at it I found there is a lot of code related to the detailed filter design, such as binary relational operators. I'm not sure whether we need to settle those filter designs in this JIRA, or whether we simply want some basic, name-based filters, like "filtering out entities with metric HDFS_BYTES_WRITE". For detailed filter designs, we may need to consider our storage-level implementations, such as our HBase implementation. After a general skim through the rest of the patch I think it's fine, and I'll post a detailed review of the rest of the code soon. > [Storage abstraction] Create backing storage read interface for ATS readers > --- > > Key: YARN-3051 > URL: https://issues.apache.org/jira/browse/YARN-3051 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Affects Versions: YARN-2928 >Reporter: Sangjin Lee >Assignee: Varun Saxena > Attachments: YARN-3051-YARN-2928.003.patch, > YARN-3051-YARN-2928.03.patch, YARN-3051-YARN-2928.04.patch, > YARN-3051-YARN-2928.05.patch, YARN-3051.Reader_API.patch, > YARN-3051.Reader_API_1.patch, YARN-3051.Reader_API_2.patch, > YARN-3051.Reader_API_3.patch, YARN-3051.Reader_API_4.patch, > YARN-3051.wip.02.YARN-2928.patch, YARN-3051.wip.patch, YARN-3051_temp.patch > > > Per design in YARN-2928, create backing storage read interface that can be > implemented by multiple backing storage implementations. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3051) [Storage abstraction] Create backing storage read interface for ATS readers
[ https://issues.apache.org/jira/browse/YARN-3051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14606466#comment-14606466 ] Varun Saxena commented on YARN-3051: [~gtCarrera9], I included the relational operators because I had already written them. I had, however, raised another JIRA for filters. We probably need to support the OR operator as well (not only AND). If you want, I can move the relational operator part out of this JIRA and into that one, and keep a simple metric filter (based on metric name) here. > [Storage abstraction] Create backing storage read interface for ATS readers > --- > > Key: YARN-3051 > URL: https://issues.apache.org/jira/browse/YARN-3051 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Affects Versions: YARN-2928 >Reporter: Sangjin Lee >Assignee: Varun Saxena > Attachments: YARN-3051-YARN-2928.003.patch, > YARN-3051-YARN-2928.03.patch, YARN-3051-YARN-2928.04.patch, > YARN-3051-YARN-2928.05.patch, YARN-3051.Reader_API.patch, > YARN-3051.Reader_API_1.patch, YARN-3051.Reader_API_2.patch, > YARN-3051.Reader_API_3.patch, YARN-3051.Reader_API_4.patch, > YARN-3051.wip.02.YARN-2928.patch, YARN-3051.wip.patch, YARN-3051_temp.patch > > > Per design in YARN-2928, create backing storage read interface that can be > implemented by multiple backing storage implementations. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3051) [Storage abstraction] Create backing storage read interface for ATS readers
[ https://issues.apache.org/jira/browse/YARN-3051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14606469#comment-14606469 ] Varun Saxena commented on YARN-3051: To be precise, YARN-3863 is meant for that: making filters as close as possible to the backend storage implementation (based on HBase Filters). > [Storage abstraction] Create backing storage read interface for ATS readers > --- > > Key: YARN-3051 > URL: https://issues.apache.org/jira/browse/YARN-3051 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Affects Versions: YARN-2928 >Reporter: Sangjin Lee >Assignee: Varun Saxena > Attachments: YARN-3051-YARN-2928.003.patch, > YARN-3051-YARN-2928.03.patch, YARN-3051-YARN-2928.04.patch, > YARN-3051-YARN-2928.05.patch, YARN-3051.Reader_API.patch, > YARN-3051.Reader_API_1.patch, YARN-3051.Reader_API_2.patch, > YARN-3051.Reader_API_3.patch, YARN-3051.Reader_API_4.patch, > YARN-3051.wip.02.YARN-2928.patch, YARN-3051.wip.patch, YARN-3051_temp.patch > > > Per design in YARN-2928, create backing storage read interface that can be > implemented by multiple backing storage implementations. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3051) [Storage abstraction] Create backing storage read interface for ATS readers
[ https://issues.apache.org/jira/browse/YARN-3051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14606510#comment-14606510 ] Li Lu commented on YARN-3051: - OK, linked all related JIRAs to this one. Feel free to add more. > [Storage abstraction] Create backing storage read interface for ATS readers > --- > > Key: YARN-3051 > URL: https://issues.apache.org/jira/browse/YARN-3051 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Affects Versions: YARN-2928 >Reporter: Sangjin Lee >Assignee: Varun Saxena > Attachments: YARN-3051-YARN-2928.003.patch, > YARN-3051-YARN-2928.03.patch, YARN-3051-YARN-2928.04.patch, > YARN-3051-YARN-2928.05.patch, YARN-3051.Reader_API.patch, > YARN-3051.Reader_API_1.patch, YARN-3051.Reader_API_2.patch, > YARN-3051.Reader_API_3.patch, YARN-3051.Reader_API_4.patch, > YARN-3051.wip.02.YARN-2928.patch, YARN-3051.wip.patch, YARN-3051_temp.patch > > > Per design in YARN-2928, create backing storage read interface that can be > implemented by multiple backing storage implementations. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3051) [Storage abstraction] Create backing storage read interface for ATS readers
[ https://issues.apache.org/jira/browse/YARN-3051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14606542#comment-14606542 ] Zhijie Shen commented on YARN-3051: ---

Varun, thanks for taking over the reader interface patch. I noticed there's a param difference: Map metricFilters. I'd recommend that we don't support the binary relationship in the initial JIRA. My initial param was meant to filter for the entities that contain the given metrics. We can file a separate JIRA to add binary relationship filtering for metrics, config and info altogether. What do you think?

Here are a couple of comments about the patch:

1. Why do we need JsonSetter for the data model objects?
2. Can we avoid introducing test-oriented configurations into YarnConfiguration, which is part of the API?
3. Does "getTimelineRecordFromJSON" need to be exposed? I'm a bit conservative about putting methods in api/common, which means we need to keep supporting them.
4. Maybe {{Field}} would be better as an inner class of TimelineReader or TimelineEntity. Otherwise, the name is a bit vague about what it represents.
5. It seems better to implement a FileSystemTimelineStorageImpl that implements both TimelineReader and TimelineWriter. One motivation is to reuse some code. A more critical problem is that the FS reader and writer are not integrated: a) The writer does not appear to write the mapping into APP_FLOW_MAPPING_FILE. b) Currently, when an entity is updated, a new entity JSON is appended to the same file, but this reader impl assumes one entity per file. We need to sync the behavior between them.
[jira] [Commented] (YARN-3051) [Storage abstraction] Create backing storage read interface for ATS readers
[ https://issues.apache.org/jira/browse/YARN-3051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14606577#comment-14606577 ] Varun Saxena commented on YARN-3051:

[~zjshen], thanks for looking at the patch.

bq. I'd like to recommend we don't support the binary relationship in the initial jira. My initial param means to filter the entities who contain the given metrics
OK, will make the change. Let's move this code out to YARN-3863.

One more thing I have changed: the default limit on entities is now 100 instead of 1000. 1000 seemed too many. Thoughts?

bq. Why do we need JsonSetter for the data model objects?
It is required when reading the JSON dump back from the file.

bq. Maybe Field is better to be the inner class of TimelineReader or TimelineEntity. Otherwise, the name a bit vague about what it represents.
OK.

bq. A more critical problem is that FS reader and writer are not integrated:
I was thinking of raising a new JIRA to integrate the writer and reader implementations, because it's not directly related to the Reader API JIRA. Will do so.

bq. It seems to be better to implement FileSystemTimelineStorageImpl that implements both TimelineReader and TimelineWriter.
That's a good suggestion.

bq. Currently, when updating an entity, a new entity json will be appended into the same file, but this reader impl assumes one entity per file
OK, so the last entity entry should be the one returned? I will check the writer-side code and ask if I have any queries. While combining the FS writer and reader, we can decide on the best option.

bq. Is "getTimelineRecordFromJSON" required to be exposed.
It was added to TimelineUtils because dumpTimelineRecordtoJSON, used by the FS writer, is in the same class, and the ObjectMapper used by both methods is also initialized there. If we combine the FS writer and reader into one class, we can probably move both methods into that class. It isn't likely to be used outside the FS implementation.
[jira] [Commented] (YARN-3051) [Storage abstraction] Create backing storage read interface for ATS readers
[ https://issues.apache.org/jira/browse/YARN-3051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14608774#comment-14608774 ] Zhijie Shen commented on YARN-3051: ---

bq. One more thing I have changed is limit of entities by default have been kept has 100 instead of 1000. 1000 seemed too many. Thoughts ?
Sure, I just noticed that the previous limit was 100 too.

bq. Ok, so the last entity entry should be the one returned ?
It's not that straightforward. For example, I can put entity 1 twice: one put contains event 1 and the other contains event 2. When I then retrieve entity 1 with the event field included, I actually want both events. I can see two choices: merge the entity data on the write path, or merge it on the read path.

bq. Added in TimelineUtils because dumpTimelineRecordtoJSON used by FS Writer was also put in the same class.
That method is used by downstream projects (e.g., Tez) for logging/debugging the ATS integration, and all getters of the data model objects are annotated, so the method is applicable to all of these objects. On the other hand, we only annotate JsonSetter for TimelineEntity, so getTimelineRecordFromJSON is not general enough for all purposes; for now it serves the FS impl only. Maybe we can hold back the method and promote it to public API once we see a real use case for it. Thoughts?
[jira] [Commented] (YARN-3051) [Storage abstraction] Create backing storage read interface for ATS readers
[ https://issues.apache.org/jira/browse/YARN-3051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14609172#comment-14609172 ] Varun Saxena commented on YARN-3051: bq. Maybe we can hold back the method and promote it to public api once we see real use case of it. Ok will move it back to FS Reader. > [Storage abstraction] Create backing storage read interface for ATS readers > --- > > Key: YARN-3051 > URL: https://issues.apache.org/jira/browse/YARN-3051 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Affects Versions: YARN-2928 >Reporter: Sangjin Lee >Assignee: Varun Saxena > Attachments: YARN-3051-YARN-2928.003.patch, > YARN-3051-YARN-2928.03.patch, YARN-3051-YARN-2928.04.patch, > YARN-3051-YARN-2928.05.patch, YARN-3051.Reader_API.patch, > YARN-3051.Reader_API_1.patch, YARN-3051.Reader_API_2.patch, > YARN-3051.Reader_API_3.patch, YARN-3051.Reader_API_4.patch, > YARN-3051.wip.02.YARN-2928.patch, YARN-3051.wip.patch, YARN-3051_temp.patch > > > Per design in YARN-2928, create backing storage read interface that can be > implemented by multiple backing storage implementations. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3051) [Storage abstraction] Create backing storage read interface for ATS readers
[ https://issues.apache.org/jira/browse/YARN-3051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14609177#comment-14609177 ] Varun Saxena commented on YARN-3051: bq. It's not that straightforward. Hmm. So it's basically a union of everything. Will handle it, on the read path for now. > [Storage abstraction] Create backing storage read interface for ATS readers > --- > > Key: YARN-3051 > URL: https://issues.apache.org/jira/browse/YARN-3051 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Affects Versions: YARN-2928 >Reporter: Sangjin Lee >Assignee: Varun Saxena > Attachments: YARN-3051-YARN-2928.003.patch, > YARN-3051-YARN-2928.03.patch, YARN-3051-YARN-2928.04.patch, > YARN-3051-YARN-2928.05.patch, YARN-3051.Reader_API.patch, > YARN-3051.Reader_API_1.patch, YARN-3051.Reader_API_2.patch, > YARN-3051.Reader_API_3.patch, YARN-3051.Reader_API_4.patch, > YARN-3051.wip.02.YARN-2928.patch, YARN-3051.wip.patch, YARN-3051_temp.patch > > > Per design in YARN-2928, create backing storage read interface that can be > implemented by multiple backing storage implementations. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3051) [Storage abstraction] Create backing storage read interface for ATS readers
[ https://issues.apache.org/jira/browse/YARN-3051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14609205#comment-14609205 ] Sangjin Lee commented on YARN-3051: --- bq. It's not that straightforward. For example, I can put entity 1 twice: one contains event 1 and the other contains event 2. In fact, when I want to retrieve the entity 1 with event field included. I actually want to have both events. I can see two choices: one is to merge the entity data at the write path and the other at the read path. I think it would be easier (for the filesystem writer/reader) to do this on the read path. > [Storage abstraction] Create backing storage read interface for ATS readers > --- > > Key: YARN-3051 > URL: https://issues.apache.org/jira/browse/YARN-3051 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Affects Versions: YARN-2928 >Reporter: Sangjin Lee >Assignee: Varun Saxena > Attachments: YARN-3051-YARN-2928.003.patch, > YARN-3051-YARN-2928.03.patch, YARN-3051-YARN-2928.04.patch, > YARN-3051-YARN-2928.05.patch, YARN-3051.Reader_API.patch, > YARN-3051.Reader_API_1.patch, YARN-3051.Reader_API_2.patch, > YARN-3051.Reader_API_3.patch, YARN-3051.Reader_API_4.patch, > YARN-3051.wip.02.YARN-2928.patch, YARN-3051.wip.patch, YARN-3051_temp.patch > > > Per design in YARN-2928, create backing storage read interface that can be > implemented by multiple backing storage implementations. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
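The read-path merge agreed on above can be sketched roughly as follows. This is only an illustration under stated assumptions: `EntityRecord` is a hypothetical, heavily simplified stand-in for TimelineEntity (events reduced to string IDs), and the "later write wins" rule for info keys is a choice made for this sketch, not something the thread settles.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical, simplified stand-in for one entity JSON record appended to a file. */
final class EntityRecord {
  final String id;
  final Set<String> events = new LinkedHashSet<>();
  final Map<String, Object> info = new LinkedHashMap<>();

  EntityRecord(String id) {
    this.id = id;
  }
}

final class ReadPathMerge {
  /**
   * Folds all records that share an entity ID into one record, in file order.
   * Events are unioned; for info, a later record overwrites an earlier key
   * (an assumption of this sketch).
   */
  static List<EntityRecord> merge(List<EntityRecord> appended) {
    Map<String, EntityRecord> byId = new LinkedHashMap<>();
    for (EntityRecord record : appended) {
      EntityRecord merged = byId.get(record.id);
      if (merged == null) {
        merged = new EntityRecord(record.id);
        byId.put(record.id, merged);
      }
      merged.events.addAll(record.events);
      merged.info.putAll(record.info);
    }
    return new ArrayList<>(byId.values());
  }
}
```

With this shape, putting entity 1 twice (once with event 1, once with event 2) yields a single entity carrying both events when it is read back, which is the behavior asked for in the discussion.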
[jira] [Commented] (YARN-3051) [Storage abstraction] Create backing storage read interface for ATS readers
[ https://issues.apache.org/jira/browse/YARN-3051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14610626#comment-14610626 ] Varun Saxena commented on YARN-3051: Is there any reason metrics and events in TimelineEntity are stored in a set? A map would make some operations easier and more efficient in the FS implementation. > [Storage abstraction] Create backing storage read interface for ATS readers > --- > > Key: YARN-3051 > URL: https://issues.apache.org/jira/browse/YARN-3051 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Affects Versions: YARN-2928 >Reporter: Sangjin Lee >Assignee: Varun Saxena > Attachments: YARN-3051-YARN-2928.003.patch, > YARN-3051-YARN-2928.03.patch, YARN-3051-YARN-2928.04.patch, > YARN-3051-YARN-2928.05.patch, YARN-3051-YARN-2928.06.patch, > YARN-3051.Reader_API.patch, YARN-3051.Reader_API_1.patch, > YARN-3051.Reader_API_2.patch, YARN-3051.Reader_API_3.patch, > YARN-3051.Reader_API_4.patch, YARN-3051.wip.02.YARN-2928.patch, > YARN-3051.wip.patch, YARN-3051_temp.patch > > > Per design in YARN-2928, create backing storage read interface that can be > implemented by multiple backing storage implementations. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3051) [Storage abstraction] Create backing storage read interface for ATS readers
[ https://issues.apache.org/jira/browse/YARN-3051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14610700#comment-14610700 ] Hadoop QA commented on YARN-3051: -

| (x) *{color:red}-1 overall{color}* |

|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | pre-patch | 16m 37s | Findbugs (version ) appears to be broken on YARN-2928. |
| {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. |
| {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 1 new or modified test files. |
| {color:green}+1{color} | javac | 7m 47s | There were no new javac warning messages. |
| {color:green}+1{color} | javadoc | 9m 43s | There were no new javadoc warning messages. |
| {color:green}+1{color} | release audit | 0m 22s | The applied patch does not increase the total number of release audit warnings. |
| {color:red}-1{color} | checkstyle | 1m 6s | The applied patch generated 3 new checkstyle issues (total was 234, now 236). |
| {color:green}+1{color} | whitespace | 0m 0s | The patch has no lines that end in whitespace. |
| {color:green}+1{color} | install | 1m 35s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse | 0m 37s | The patch built with eclipse:eclipse. |
| {color:red}-1{color} | findbugs | 2m 24s | The patch appears to introduce 1 new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | yarn tests | 0m 25s | Tests passed in hadoop-yarn-api. |
| {color:red}-1{color} | yarn tests | 7m 52s | Tests failed in hadoop-yarn-server-timelineservice. |
| | | 48m 43s | |

|| Reason || Tests ||
| FindBugs | module:hadoop-yarn-server-timelineservice |
| Failed unit tests | hadoop.yarn.server.timelineservice.storage.TestHBaseTimelineWriterImpl |
| | hadoop.yarn.server.timelineservice.storage.TestPhoenixTimelineWriterImpl |

|| Subsystem || Report/Notes ||
| Patch URL | http://issues.apache.org/jira/secure/attachment/12743118/YARN-3051-YARN-2928.06.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | YARN-2928 / 18c4859 |
| checkstyle | https://builds.apache.org/job/PreCommit-YARN-Build/8409/artifact/patchprocess/diffcheckstylehadoop-yarn-api.txt |
| Findbugs warnings | https://builds.apache.org/job/PreCommit-YARN-Build/8409/artifact/patchprocess/newPatchFindbugsWarningshadoop-yarn-server-timelineservice.html |
| hadoop-yarn-api test log | https://builds.apache.org/job/PreCommit-YARN-Build/8409/artifact/patchprocess/testrun_hadoop-yarn-api.txt |
| hadoop-yarn-server-timelineservice test log | https://builds.apache.org/job/PreCommit-YARN-Build/8409/artifact/patchprocess/testrun_hadoop-yarn-server-timelineservice.txt |
| Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/8409/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf901.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | https://builds.apache.org/job/PreCommit-YARN-Build/8409/console |

This message was automatically generated.
[jira] [Commented] (YARN-3051) [Storage abstraction] Create backing storage read interface for ATS readers
[ https://issues.apache.org/jira/browse/YARN-3051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14611156#comment-14611156 ] Zhijie Shen commented on YARN-3051: ---

[~varun_saxena], thanks for being patient with the comments. Here are some more about the new patch.

1. We should compare the objects directly instead of converting them to String first.
{code}
private static boolean matchFilter(Object infoValue, Object filterValue) {
  // Convert to String and check for now.
  return infoValue.toString().equals(filterValue.toString());
}
{code}

2. No one is writing the mapping into APP_FLOW_MAPPING_FILE in the current code base? Are you suggesting treating it as a property file? What's the rationale? How about using CSV format? It would 1) allow searching for user/flowId/flowRunId separately and 2) be neutral about the path separator.
{code}
prop.setProperty("app1", "user1/flow1/1");
{code}

3. Can we avoid introducing the test-oriented configurations into YarnConfiguration?

4. We can do some optimization for the file implementation, such as putting the created time and modified time into the file name to quickly filter these files without reading them, and merging the entities and overwriting the file to avoid merging again for each query. But that's not critical here; we can do it later.
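To make the CSV suggestion in point 2 concrete, here is a minimal sketch of how an app-to-flow mapping file could be parsed. Everything in it is an assumption for illustration: the column order `appId,user,flowId,flowRunId`, the `FlowContext` and `AppFlowMapping` names, and the naive comma split (which would not cope with quoted fields or commas inside flow names).

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical holder for the three values that were packed into "user1/flow1/1". */
final class FlowContext {
  final String user;
  final String flowId;
  final long flowRunId;

  FlowContext(String user, String flowId, long flowRunId) {
    this.user = user;
    this.flowId = flowId;
    this.flowRunId = flowRunId;
  }
}

final class AppFlowMapping {
  /**
   * Parses lines of the assumed form "app1,user1,flow1,1" into a lookup keyed by
   * application ID. Blank lines are skipped; malformed lines are rejected.
   */
  static Map<String, FlowContext> parse(Iterable<String> lines) {
    Map<String, FlowContext> mapping = new HashMap<>();
    for (String line : lines) {
      String trimmed = line.trim();
      if (trimmed.isEmpty()) {
        continue;
      }
      String[] cols = trimmed.split(",", -1);
      if (cols.length != 4) {
        throw new IllegalArgumentException("Expected 4 columns but got: " + line);
      }
      mapping.put(cols[0], new FlowContext(cols[1], cols[2], Long.parseLong(cols[3])));
    }
    return mapping;
  }
}
```

Because each value sits in its own column, a reader can match on user, flow ID or flow run ID independently, and nothing in the stored value depends on a path separator, which are the two benefits named in the comment.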
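The first review point above (compare the objects directly) could look roughly like the sketch below. It is only an illustration: the class name `InfoFilterMatcher` is hypothetical and not taken from the patch, and it assumes the info value types provide sensible `equals()` implementations.

```java
import java.util.Objects;

/** Hypothetical helper; the class name is illustrative and not from the patch. */
final class InfoFilterMatcher {
  private InfoFilterMatcher() {
  }

  /**
   * Null-safe comparison that relies on the value types' own equals()
   * instead of comparing their String forms.
   */
  static boolean matchFilter(Object infoValue, Object filterValue) {
    return Objects.equals(infoValue, filterValue);
  }
}
```

One consequence worth keeping in mind: with direct comparison, `Integer.valueOf(1)` no longer matches the String `"1"`, and `Long.valueOf(1)` does not match `Integer.valueOf(1)`, so callers would need to supply filter values of the same type as the stored info values.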
[jira] [Commented] (YARN-3051) [Storage abstraction] Create backing storage read interface for ATS readers
[ https://issues.apache.org/jira/browse/YARN-3051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14611196#comment-14611196 ]

Varun Saxena commented on YARN-3051:

Thanks for the review, [~zjshen].

bq. 1. We should compare the objects directly instead of converting them to String first.
Correct.

bq. 2. No one is writing the mapping into APP_FLOW_MAPPING_FILE in the current code base?
Yes, nobody had handled the app-to-flow mapping in the writer code path, so I came up with this solution. We can write CSV as well. I will write the writer code too when I combine the FS writer and reader classes. Regarding searching for user/flowId/flowRunId separately, do you mean storing them in separate files?

bq. 3. Can we prevent introducing the test oriented configurations into YarnConfiguration?
Do you mean the config for the FS storage root directory (TIMELINE_SERVICE_STORAGE_DIR_ROOT)? It is used by the writer as well and is expected to be a configuration, hence I moved it to YarnConfiguration. Do we not want it as a configuration? I plan to change the config name, but that would require a change in the writer too.

bq. We can do some optimization for the file implementation, such as putting created time and modified time into file name to quickly filter these files without reading them, merging the entities and overwriting the file to prevent merging again for each query. But that's not critical here, we can do it later.
These are good suggestions. I had thought of the first one too, but getting a working patch in took priority. Anyway, I will handle it when I merge the FS reader and writer.
[jira] [Commented] (YARN-3051) [Storage abstraction] Create backing storage read interface for ATS readers
[ https://issues.apache.org/jira/browse/YARN-3051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14612113#comment-14612113 ]

Zhijie Shen commented on YARN-3051:
---

2. I meant that we store the mapping in a CSV file. Thoughts?

3. I think FS impl related config shouldn't be put in the API, as the impl is not supposed to be used by the public but only for test purposes.
[jira] [Commented] (YARN-3051) [Storage abstraction] Create backing storage read interface for ATS readers
[ https://issues.apache.org/jira/browse/YARN-3051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14612117#comment-14612117 ]

Varun Saxena commented on YARN-3051:

Ok...Will make the change
[jira] [Commented] (YARN-3051) [Storage abstraction] Create backing storage read interface for ATS readers
[ https://issues.apache.org/jira/browse/YARN-3051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14615126#comment-14615126 ]

Hadoop QA commented on YARN-3051:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | pre-patch | 17m 22s | Findbugs (version ) appears to be broken on YARN-2928. |
| {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. |
| {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 1 new or modified test files. |
| {color:green}+1{color} | javac | 8m 10s | There were no new javac warning messages. |
| {color:green}+1{color} | javadoc | 10m 11s | There were no new javadoc warning messages. |
| {color:green}+1{color} | release audit | 0m 23s | The applied patch does not increase the total number of release audit warnings. |
| {color:green}+1{color} | checkstyle | 1m 18s | There were no new checkstyle issues. |
| {color:green}+1{color} | whitespace | 0m 0s | The patch has no lines that end in whitespace. |
| {color:green}+1{color} | install | 1m 39s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse | 0m 41s | The patch built with eclipse:eclipse. |
| {color:green}+1{color} | findbugs | 2m 23s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | yarn tests | 0m 24s | Tests passed in hadoop-yarn-api. |
| {color:green}+1{color} | yarn tests | 1m 19s | Tests passed in hadoop-yarn-server-timelineservice. |
| | | 43m 56s | |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | http://issues.apache.org/jira/secure/attachment/12743727/YARN-3051-YARN-2928.07.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | YARN-2928 / 18c4859 |
| hadoop-yarn-api test log | https://builds.apache.org/job/PreCommit-YARN-Build/8439/artifact/patchprocess/testrun_hadoop-yarn-api.txt |
| hadoop-yarn-server-timelineservice test log | https://builds.apache.org/job/PreCommit-YARN-Build/8439/artifact/patchprocess/testrun_hadoop-yarn-server-timelineservice.txt |
| Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/8439/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf903.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | https://builds.apache.org/job/PreCommit-YARN-Build/8439/console |

This message was automatically generated.

> [Storage abstraction] Create backing storage read interface for ATS readers
> ---
>
> Key: YARN-3051
> URL: https://issues.apache.org/jira/browse/YARN-3051
> Project: Hadoop YARN
> Issue Type: Sub-task
> Components: timelineserver
> Affects Versions: YARN-2928
> Reporter: Sangjin Lee
> Assignee: Varun Saxena
> Attachments: YARN-3051-YARN-2928.003.patch,
> YARN-3051-YARN-2928.03.patch, YARN-3051-YARN-2928.04.patch,
> YARN-3051-YARN-2928.05.patch, YARN-3051-YARN-2928.06.patch,
> YARN-3051-YARN-2928.07.patch, YARN-3051.Reader_API.patch,
> YARN-3051.Reader_API_1.patch, YARN-3051.Reader_API_2.patch,
> YARN-3051.Reader_API_3.patch, YARN-3051.Reader_API_4.patch,
> YARN-3051.wip.02.YARN-2928.patch, YARN-3051.wip.patch, YARN-3051_temp.patch
>
>
> Per design in YARN-2928, create backing storage read interface that can be
> implemented by multiple backing storage implementations.

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3051) [Storage abstraction] Create backing storage read interface for ATS readers
[ https://issues.apache.org/jira/browse/YARN-3051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14615295#comment-14615295 ]

Varun Saxena commented on YARN-3051:

[~zjshen], [~sjlee0], kindly review.
[jira] [Commented] (YARN-3051) [Storage abstraction] Create backing storage read interface for ATS readers
[ https://issues.apache.org/jira/browse/YARN-3051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14615354#comment-14615354 ]

Sangjin Lee commented on YARN-3051:
---

Thanks [~varun_saxena] for providing a quick update! My latest comments are mostly on FileSystemTimelineReaderImpl.java.

- l.151-152: [~zjshen] previously pointed this out but I don't see this changed in the latest patch. Do info values have to be converted into strings to be compared for equality? Is it because you worry about the info value types not implementing equals()? Can we not assume that it is expected for the info value types to provide sensible equals() implementations?
- l.192: How do you deal with a situation where "," is used in the tokens themselves? Note that flow names may contain commas (there is no reason they cannot). The separators should be escaped on the way in and unescaped on the way out. And it'd be good to have some unit tests for this case.
- l.220: matchMetricFilters() is static while matchEventFilters() is not. Could you make it consistent across all private helper methods?
- l.249: nit: it can be a simple return statement instead of the if clause.
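The escaping concern raised for l.192 (separators escaped on the way in, unescaped on the way out) can be sketched as follows. This is a minimal illustration under stated assumptions, not the code from the patch: the class name `MappingLineCodec` and the choice of backslash as the escape character are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical codec for one line of an app-to-flow mapping file. */
final class MappingLineCodec {
  private static final char SEP = ',';
  private static final char ESC = '\\';

  private MappingLineCodec() {
  }

  /** Joins tokens with commas, escaping any comma or backslash inside a token. */
  static String join(List<String> tokens) {
    StringBuilder sb = new StringBuilder();
    for (int i = 0; i < tokens.size(); i++) {
      if (i > 0) {
        sb.append(SEP);
      }
      for (char c : tokens.get(i).toCharArray()) {
        if (c == SEP || c == ESC) {
          sb.append(ESC);
        }
        sb.append(c);
      }
    }
    return sb.toString();
  }

  /** Splits a line on unescaped commas and removes the escape characters. */
  static List<String> split(String line) {
    List<String> tokens = new ArrayList<>();
    StringBuilder current = new StringBuilder();
    boolean escaped = false;
    for (char c : line.toCharArray()) {
      if (escaped) {
        current.append(c);
        escaped = false;
      } else if (c == ESC) {
        escaped = true;
      } else if (c == SEP) {
        tokens.add(current.toString());
        current.setLength(0);
      } else {
        current.append(c);
      }
    }
    tokens.add(current.toString());
    return tokens;
  }
}
```

A round trip such as `split(join(tokens))` on a flow name containing commas is the kind of unit test the review asks for. A dedicated CSV library would handle quoting and line breaks as well, which this sketch does not attempt.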
[jira] [Commented] (YARN-3051) [Storage abstraction] Create backing storage read interface for ATS readers
[ https://issues.apache.org/jira/browse/YARN-3051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14615363#comment-14615363 ]

Varun Saxena commented on YARN-3051:

bq. Do info values have to be converted into strings to be compared for equality?
Sorry, missed this change.

bq. How do you deal with a situation where "," is used in the tokens themselves?
Wasn't expecting commas in flow. Will handle it.

bq. matchMetricFilters() is static while matchEventFilters() is not. Could you make it consistent across all private helper methods?
Ok. Missed it.

bq. it can be a simple return statement instead of the if clause.
Ok
[jira] [Commented] (YARN-3051) [Storage abstraction] Create backing storage read interface for ATS readers
[ https://issues.apache.org/jira/browse/YARN-3051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14615387#comment-14615387 ]

Zhijie Shen commented on YARN-3051:
---

How about using the Apache Commons CSV library to handle the lookup file? http://commons.apache.org/proper/commons-csv/index.html
[jira] [Commented] (YARN-3051) [Storage abstraction] Create backing storage read interface for ATS readers
[ https://issues.apache.org/jira/browse/YARN-3051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14615390#comment-14615390 ]

Varun Saxena commented on YARN-3051:

Oh we have an Apache Lib for it. Will use it. Thanks [~zjshen]
[jira] [Commented] (YARN-3051) [Storage abstraction] Create backing storage read interface for ATS readers
[ https://issues.apache.org/jira/browse/YARN-3051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14615741#comment-14615741 ]

Hadoop QA commented on YARN-3051:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | pre-patch | 18m 30s | Findbugs (version ) appears to be broken on YARN-2928. |
| {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. |
| {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 1 new or modified test files. |
| {color:green}+1{color} | javac | 8m 35s | There were no new javac warning messages. |
| {color:green}+1{color} | javadoc | 10m 33s | There were no new javadoc warning messages. |
| {color:green}+1{color} | release audit | 0m 24s | The applied patch does not increase the total number of release audit warnings. |
| {color:green}+1{color} | checkstyle | 1m 22s | There were no new checkstyle issues. |
| {color:green}+1{color} | whitespace | 0m 0s | The patch has no lines that end in whitespace. |
| {color:green}+1{color} | install | 1m 48s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse | 0m 46s | The patch built with eclipse:eclipse. |
| {color:green}+1{color} | findbugs | 2m 31s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | yarn tests | 0m 25s | Tests passed in hadoop-yarn-api. |
| {color:green}+1{color} | yarn tests | 1m 21s | Tests passed in hadoop-yarn-server-timelineservice. |
| | | 46m 23s | |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | http://issues.apache.org/jira/secure/attachment/12743797/YARN-3051-YARN-2928.08.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | YARN-2928 / 6837552 |
| hadoop-yarn-api test log | https://builds.apache.org/job/PreCommit-YARN-Build/8443/artifact/patchprocess/testrun_hadoop-yarn-api.txt |
| hadoop-yarn-server-timelineservice test log | https://builds.apache.org/job/PreCommit-YARN-Build/8443/artifact/patchprocess/testrun_hadoop-yarn-server-timelineservice.txt |
| Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/8443/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf902.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | https://builds.apache.org/job/PreCommit-YARN-Build/8443/console |

This message was automatically generated.

> [Storage abstraction] Create backing storage read interface for ATS readers
> ---
>
> Key: YARN-3051
> URL: https://issues.apache.org/jira/browse/YARN-3051
> Project: Hadoop YARN
> Issue Type: Sub-task
> Components: timelineserver
> Affects Versions: YARN-2928
> Reporter: Sangjin Lee
> Assignee: Varun Saxena
> Attachments: YARN-3051-YARN-2928.003.patch,
> YARN-3051-YARN-2928.03.patch, YARN-3051-YARN-2928.04.patch,
> YARN-3051-YARN-2928.05.patch, YARN-3051-YARN-2928.06.patch,
> YARN-3051-YARN-2928.07.patch, YARN-3051-YARN-2928.08.patch,
> YARN-3051.Reader_API.patch, YARN-3051.Reader_API_1.patch,
> YARN-3051.Reader_API_2.patch, YARN-3051.Reader_API_3.patch,
> YARN-3051.Reader_API_4.patch, YARN-3051.wip.02.YARN-2928.patch,
> YARN-3051.wip.patch, YARN-3051_temp.patch
>
>
> Per design in YARN-2928, create backing storage read interface that can be
> implemented by multiple backing storage implementations.

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3051) [Storage abstraction] Create backing storage read interface for ATS readers
[ https://issues.apache.org/jira/browse/YARN-3051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14615747#comment-14615747 ]

Zhijie Shen commented on YARN-3051:
---

Hi Varun, thanks for updating the patch. I have only one remaining issue with this patch:

According to https://issues.apache.org/jira/secure/attachment/12743391/hbase-schema-proposal-for-aggregation.pdf, it seems that we have chosen clusterId + appId to globally identify a unique flow run. I think we should do the same here by adding clusterId, which is a mandatory field. /cc [~sjlee0].

Some other improvements are required in the future for robustness and performance. Let's make sure we have a JIRA to improve the reader later.

1. Maybe we want to cache the mapping instead of reading it from the file for every query.
2. The limit should be pushed down into the for loop. If we want to retrieve just 10 entities, it is unnecessary to go through 1000 qualified candidates and finally pick the top 10.
3. We'd better avoid hard-coding "/" as the path separator, and we should use the FileSystem interface to operate on the files, so that the impl can also work with HDFS.
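The caching suggestion in point 1 above could be sketched along these lines. It is a rough illustration only: `AppFlowMappingCache`, its loader function, and the per-cluster keying are assumptions made for the example, and it relies on Java 8+ (`computeIfAbsent`, lambdas), which is newer than the Java 1.7 used by the QA build in this thread.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

/**
 * Hypothetical cache of app-to-flow mappings, loaded at most once per cluster.
 * The loader stands in for whatever reads and parses the mapping file.
 */
final class AppFlowMappingCache {
  private final Map<String, Map<String, String>> byCluster = new ConcurrentHashMap<>();
  private final Function<String, Map<String, String>> loader;

  AppFlowMappingCache(Function<String, Map<String, String>> loader) {
    this.loader = loader;
  }

  /** Returns the flow info recorded for the app, or null if none is known. */
  String lookup(String clusterId, String appId) {
    // The loader runs only on the first lookup for a given cluster.
    Map<String, String> mapping = byCluster.computeIfAbsent(clusterId, loader);
    return mapping == null ? null : mapping.get(appId);
  }

  /** Drops the cached mapping so the next lookup reloads it. */
  void invalidate(String clusterId) {
    byCluster.remove(clusterId);
  }
}
```

A real reader would also need a policy for picking up rows appended after the first load; the `invalidate` method here is only a placeholder for that decision.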
[jira] [Commented] (YARN-3051) [Storage abstraction] Create backing storage read interface for ATS readers
[ https://issues.apache.org/jira/browse/YARN-3051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14615776#comment-14615776 ]

Varun Saxena commented on YARN-3051:

[~zjshen],

bq. we have chosen clusterId + appId to globally find a unique flow run. I think here we should do it similar by adding clusterId
The current FS implementation has the cluster as part of the path, so there will be an app_flow_mapping.csv for each cluster. In a way, it is part of the primary key even though it's not in app_flow_mapping.csv itself. I hope that is what your concern was.

bq. 1. Maybe we want to cache the mapping instead of reading it from the file for every query.
Yes, we should be doing so. I plan to do these optimizations in a later JIRA. Some other optimizations are also needed: we use a set instead of a map for storing metrics and events, so I have to iterate over all of them. Is there any issue with turning them into maps?

bq. 2. limit should be push down into the for loop. It's unnecessary that if we want to just retrieve.
The issue here is that we want a limit on entities, but these should be the latest entities (sorted in descending order of created time). Having the created time in the entity file name will help avoid reading all the files.

bq. 3. We'd better avoid hard code "/" as the path separator, and we should use FileSystem interface to operate the files, such that the impl can also work with HDFS.
Ok.
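The exchange above about applying the limit while still returning the latest entities can be illustrated with a bounded min-heap, which keeps only `limit` candidates in memory while scanning. This is a sketch under assumptions, not the patch's approach: `LatestEntities` and its nested `Entity` type are hypothetical stand-ins, and the code uses Java 8+ comparator helpers.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

/** Hypothetical helper that keeps only the newest entities while scanning. */
final class LatestEntities {
  /** Stand-in for a timeline entity; only the fields needed here. */
  static final class Entity {
    final String id;
    final long createdTime;

    Entity(String id, long createdTime) {
      this.id = id;
      this.createdTime = createdTime;
    }
  }

  private LatestEntities() {
  }

  /** Returns at most {@code limit} entities, newest first. */
  static List<Entity> latest(Iterable<Entity> candidates, int limit) {
    if (limit <= 0) {
      return new ArrayList<>();
    }
    // Min-heap on created time: the head is the oldest entity kept so far.
    PriorityQueue<Entity> heap = new PriorityQueue<>(
        limit, Comparator.comparingLong((Entity e) -> e.createdTime));
    for (Entity e : candidates) {
      if (heap.size() < limit) {
        heap.add(e);
      } else if (e.createdTime > heap.peek().createdTime) {
        heap.poll();
        heap.add(e);
      }
    }
    List<Entity> result = new ArrayList<>(heap);
    result.sort(Comparator.comparingLong((Entity e) -> e.createdTime).reversed());
    return result;
  }
}
```

Note that this still visits every candidate; it bounds memory and sorting cost, but skipping file reads entirely would need something like the created time in the file name, as mentioned in the reply.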
[jira] [Commented] (YARN-3051) [Storage abstraction] Create backing storage read interface for ATS readers
[ https://issues.apache.org/jira/browse/YARN-3051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14615792#comment-14615792 ]

Zhijie Shen commented on YARN-3051:
---

bq. The current FS implementation had cluster as part of the path. So there will a app_flow_mapping.csv for each cluster. So in a way it is part of the primary key even though its not there in app_flow_mapping.csv I hope that is what your concern was.
The problem is in the write path. Suppose we unfortunately have a duplicate appId: one is clusterId1/appId and the other is clusterId2/appId. When the former entity is written, you have added appId to the mapping file. How do you write the mapping file for clusterId2/appId? Overwrite the row for appId? Append one more row for appId? Either will cause trouble when looking up the right flow info for a query that uses default values.
[jira] [Commented] (YARN-3051) [Storage abstraction] Create backing storage read interface for ATS readers
[ https://issues.apache.org/jira/browse/YARN-3051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14615818#comment-14615818 ]

Varun Saxena commented on YARN-3051:

bq. Overwriting the row of appId? Appending one more row of appId?
No. cluster1 and cluster2 will each have their own directory. That is, if the default root directory is {{/tmp/timeline_service_data}} and there are 2 cluster IDs, we will have one app flow mapping file at {{/tmp/timeline_service_data/cluster1/app_flow_mapping.csv}} and the other at {{/tmp/timeline_service_data/cluster2/app_flow_mapping.csv}}.
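The per-cluster layout described in this reply can be sketched as a small path helper. This is only an illustration: `AppFlowMappingPaths` is a hypothetical name, and it uses `java.nio.file` as a stand-in so the example stays self-contained, whereas the thread suggests Hadoop's FileSystem interface so that the implementation can also work with HDFS.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

/** Hypothetical helper that locates the mapping file for a given cluster. */
final class AppFlowMappingPaths {
  static final String MAPPING_FILE_NAME = "app_flow_mapping.csv";

  private AppFlowMappingPaths() {
  }

  /**
   * Builds <root>/<clusterId>/app_flow_mapping.csv without hard-coding "/",
   * so each cluster gets its own mapping file.
   */
  static Path mappingFile(String rootDir, String clusterId) {
    return Paths.get(rootDir, clusterId, MAPPING_FILE_NAME);
  }
}
```

Because the cluster ID is part of the path, the same appId under two clusters resolves to two different files, which is why clusterId does not need to appear as a column inside the mapping file itself.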
[jira] [Commented] (YARN-3051) [Storage abstraction] Create backing storage read interface for ATS readers
[ https://issues.apache.org/jira/browse/YARN-3051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14615825#comment-14615825 ]

Varun Saxena commented on YARN-3051:

Is this approach fine, or do you prefer a single app flow mapping file? I segregated it per cluster partly to reduce the number of records to read, but that will be less of a concern once we cache the mapping.
[jira] [Commented] (YARN-3051) [Storage abstraction] Create backing storage read interface for ATS readers
[ https://issues.apache.org/jira/browse/YARN-3051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14615839#comment-14615839 ]

Zhijie Shen commented on YARN-3051:
---

Okay, then it seems to be fine. I didn't notice it's a per-cluster mapping file.
[jira] [Commented] (YARN-3051) [Storage abstraction] Create backing storage read interface for ATS readers
[ https://issues.apache.org/jira/browse/YARN-3051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14615930#comment-14615930 ] Zhijie Shen commented on YARN-3051: --- Will commit the patch later today if there are no more comments.
[jira] [Commented] (YARN-3051) [Storage abstraction] Create backing storage read interface for ATS readers
[ https://issues.apache.org/jira/browse/YARN-3051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14615941#comment-14615941 ] Sangjin Lee commented on YARN-3051: --- +1 from me. Thanks!
[jira] [Commented] (YARN-3051) [Storage abstraction] Create backing storage read interface for ATS readers
[ https://issues.apache.org/jira/browse/YARN-3051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14616218#comment-14616218 ] Varun Saxena commented on YARN-3051: Thanks [~zjshen] for the commit. Thanks [~zjshen], [~sjlee0], [~gtCarrera9] and [~jrottinghuis] for the review. > [Storage abstraction] Create backing storage read interface for ATS readers > --- > > Key: YARN-3051 > URL: https://issues.apache.org/jira/browse/YARN-3051 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Affects Versions: YARN-2928 >Reporter: Sangjin Lee >Assignee: Varun Saxena > Fix For: YARN-2928 > > Attachments: YARN-3051-YARN-2928.003.patch, > YARN-3051-YARN-2928.03.patch, YARN-3051-YARN-2928.04.patch, > YARN-3051-YARN-2928.05.patch, YARN-3051-YARN-2928.06.patch, > YARN-3051-YARN-2928.07.patch, YARN-3051-YARN-2928.08.patch, > YARN-3051.Reader_API.patch, YARN-3051.Reader_API_1.patch, > YARN-3051.Reader_API_2.patch, YARN-3051.Reader_API_3.patch, > YARN-3051.Reader_API_4.patch, YARN-3051.wip.02.YARN-2928.patch, > YARN-3051.wip.patch, YARN-3051_temp.patch > > > Per design in YARN-2928, create backing storage read interface that can be > implemented by multiple backing storage implementations. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3051) [Storage abstraction] Create backing storage read interface for ATS readers
[ https://issues.apache.org/jira/browse/YARN-3051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14384389#comment-14384389 ] Li Lu commented on YARN-3051: - Hi [~varun_saxena], any progress on the reader API side for now? The new reader API is blocking our storage implementations, so if you have any bandwidth problems feel free to let us know. I can take it over if necessary. > [Storage abstraction] Create backing storage read interface for ATS readers > --- > > Key: YARN-3051 > URL: https://issues.apache.org/jira/browse/YARN-3051 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Reporter: Sangjin Lee >Assignee: Varun Saxena > > Per design in YARN-2928, create backing storage read interface that can be > implemented by multiple backing storage implementations. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3051) [Storage abstraction] Create backing storage read interface for ATS readers
[ https://issues.apache.org/jira/browse/YARN-3051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14384422#comment-14384422 ] Varun Saxena commented on YARN-3051:

I am relatively free this weekend, so I will be able to work on this on priority. I will let you know if I run into bandwidth issues.

We had decided on the three APIs below, which are somewhat similar to what existed in ATS v1. Now, as you mentioned in a comment elsewhere, we need to support metrics too. So what kinds of queries have we decided to support? For instance, queries such as "get apps for which a particular metric's value is less than or greater than some value"?

{code}
TimelineEntities getEntities(String entityType, long limit,
    long windowStart, Long windowEnd, String fromId, long fromTs,
    Collection filters, EnumSet fieldsToRetrieve) throws IOException;

TimelineEntity getEntity(String entityId, String entityType,
    EnumSet fieldsToRetrieve) throws IOException;

TimelineEvents getEntityTimelines(String entityType, SortedSet entityIds,
    long limit, long windowStart, long windowEnd, Set eventTypes)
    throws IOException;
{code}
[jira] [Commented] (YARN-3051) [Storage abstraction] Create backing storage read interface for ATS readers
[ https://issues.apache.org/jira/browse/YARN-3051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14384566#comment-14384566 ] Li Lu commented on YARN-3051: - bq. We had decided on below three APIs' which are somewhat similar to what existed in ATS v1. Isn't that what we already have in YARN-3047?
[jira] [Commented] (YARN-3051) [Storage abstraction] Create backing storage read interface for ATS readers
[ https://issues.apache.org/jira/browse/YARN-3051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14384578#comment-14384578 ] Varun Saxena commented on YARN-3051: No. I had initially kept it there but later moved it out so that the store implementation can be in YARN-3051. This JIRA will have the file system implementation.
[jira] [Commented] (YARN-3051) [Storage abstraction] Create backing storage read interface for ATS readers
[ https://issues.apache.org/jira/browse/YARN-3051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14384615#comment-14384615 ] Sangjin Lee commented on YARN-3051: ---

A couple of things to discuss:

In principle, a *shallow* view of the entity will be returned by default, right? Specifically, I'm wondering whether all configs and metrics should be included in the default view or not. What does ATS v.1 do in this regard? FYI, I believe most of the YARN REST API returns a shallow view of objects. Note that the size of the responses could become quite big if we include configs and metrics by default.

On a related note, if we decide to return shallow views by default, then the question is how we ask the reader to get things like configs and metrics. The reader API as well as the reader storage interface should be able to support calls to retrieve configs/metrics, perhaps with new methods.

bq. For instance, queries such as get apps which have a particular metric's value less than or greater than something ?

Metric/config-based queries will probably need changes to the API. We would want to be able to run queries like "return apps where config X = Y" or "return apps where metric A > B". But we can consider them advanced queries.
[jira] [Commented] (YARN-3051) [Storage abstraction] Create backing storage read interface for ATS readers
[ https://issues.apache.org/jira/browse/YARN-3051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14384634#comment-14384634 ] Varun Saxena commented on YARN-3051: That is what was initially decided. We can handle the file system implementation in another JIRA as well. But since the file system implementation will be the default, we thought we could handle it here.
[jira] [Commented] (YARN-3051) [Storage abstraction] Create backing storage read interface for ATS readers
[ https://issues.apache.org/jira/browse/YARN-3051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14384656#comment-14384656 ] Varun Saxena commented on YARN-3051: Configs and metrics will be retrieved as part of an entity. We can filter which fields to retrieve based on {{EnumSet fieldsToRetrieve}}; null means all fields will be retrieved. So if we do not want all configs and metrics, we can leave them out and list the other fields in fieldsToRetrieve. This can be specified in the REST URL as {{fields=}}
[jira] [Commented] (YARN-3051) [Storage abstraction] Create backing storage read interface for ATS readers
[ https://issues.apache.org/jira/browse/YARN-3051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14384675#comment-14384675 ] Li Lu commented on YARN-3051: - bq. configs and metrics will be retrieved as part of an entity. The most significant concern here is the size of configs and metrics. I think that's why [~sjlee0] is proposing a shallow view here. Still waiting for [~zjshen]'s confirmation for v1, but for v2 I think we may need something like this.
[jira] [Commented] (YARN-3051) [Storage abstraction] Create backing storage read interface for ATS readers
[ https://issues.apache.org/jira/browse/YARN-3051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14384684#comment-14384684 ] Varun Saxena commented on YARN-3051: Keeping this in mind, do you think a new method will be required to fetch configs and metrics? I guess not.
[jira] [Commented] (YARN-3051) [Storage abstraction] Create backing storage read interface for ATS readers
[ https://issues.apache.org/jira/browse/YARN-3051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14384699#comment-14384699 ] Varun Saxena commented on YARN-3051: To elaborate on what the getEntities API will do: it will support filters similar to secondary filters by matching the info field. Yes, the API would need to be enhanced to support queries based on configs and metrics. I think that can be part of the same getEntities API. As mentioned above, equality can be checked for configs, and all the relational operators will have to be supported for metrics. We can probably add two parameters to the API, namely configFilters and metricsFilters. I guess that should do. I don't think filtering will be done on the basis of any other field.
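The filtering idea in the comment above (equality checks for configs, relational operators for metrics) can be sketched roughly as follows. This is only an illustration under assumptions: the types {{CompareOp}}, {{MetricFilter}} and {{EntityFilters}} are hypothetical and are not taken from any patch on this JIRA, and metric values are simplified to a single long per metric.

```java
import java.util.List;
import java.util.Map;

// Hypothetical relational operators for metric filters (not from the patch).
enum CompareOp {
    LT, LE, EQ, NE, GE, GT;

    boolean test(long actual, long expected) {
        switch (this) {
            case LT: return actual < expected;
            case LE: return actual <= expected;
            case EQ: return actual == expected;
            case NE: return actual != expected;
            case GE: return actual >= expected;
            default: return actual > expected; // GT
        }
    }
}

// Hypothetical filter of the form "metric <op> value", e.g. "A > B".
final class MetricFilter {
    final String name;
    final CompareOp op;
    final long value;

    MetricFilter(String name, CompareOp op, long value) {
        this.name = name;
        this.op = op;
        this.value = value;
    }
}

final class EntityFilters {
    private EntityFilters() {}

    // Returns true only if every config filter matches by equality and
    // every metric filter holds. A missing config or metric never matches.
    static boolean matches(Map<String, String> configs,
                           Map<String, Long> metrics,
                           Map<String, String> configFilters,
                           List<MetricFilter> metricFilters) {
        for (Map.Entry<String, String> e : configFilters.entrySet()) {
            if (!e.getValue().equals(configs.get(e.getKey()))) {
                return false;
            }
        }
        for (MetricFilter f : metricFilters) {
            Long actual = metrics.get(f.name);
            if (actual == null || !f.op.test(actual, f.value)) {
                return false;
            }
        }
        return true;
    }
}
```

A store that cannot push such predicates down (a file system store, for example) could apply a check like this after loading each candidate entity, whereas a SQL-like store would more likely translate the filters into its own query language.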
[jira] [Commented] (YARN-3051) [Storage abstraction] Create backing storage read interface for ATS readers
[ https://issues.apache.org/jira/browse/YARN-3051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14384698#comment-14384698 ] Varun Saxena commented on YARN-3051: Yeah, I meant it can still be supported if the client specifies which fields are to be retrieved. But I do understand the concern here. The default view should return all fields except configs and metrics.
[jira] [Commented] (YARN-3051) [Storage abstraction] Create backing storage read interface for ATS readers
[ https://issues.apache.org/jira/browse/YARN-3051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14384709#comment-14384709 ] Varun Saxena commented on YARN-3051: Regarding the shallow view of an entity, we can then say that if {{fieldsToRetrieve}} is null, i.e. the client does not specify which fields to retrieve, the store implementation will return all fields except configs and metrics. I can add another special field called "all", which would indicate that all fields have to be retrieved. So if the client specifies fields=all in the REST URL, the storage implementation will fetch all the fields. Thoughts?
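One way to make the proposed default explicit in code is to resolve {{fieldsToRetrieve}} into a concrete set of fields before the store is asked to load anything. The sketch below assumes a hypothetical {{Field}} enum (the member names, including ALL, are illustrative) and a hypothetical helper {{FieldSelection.resolve}}; neither is taken from the patch.

```java
import java.util.EnumSet;

// Hypothetical field enum; the names, including ALL, are illustrative only.
enum Field { EVENTS, INFO, CONFIGS, METRICS, ALL }

final class FieldSelection {
    private FieldSelection() {}

    // Resolves what the client asked for into the concrete fields to load.
    static EnumSet<Field> resolve(EnumSet<Field> fieldsToRetrieve) {
        // Every real field, i.e. everything except the ALL marker itself.
        EnumSet<Field> everything = EnumSet.complementOf(EnumSet.of(Field.ALL));
        if (fieldsToRetrieve == null) {
            // Default (shallow) view: all fields except configs and metrics.
            everything.remove(Field.CONFIGS);
            everything.remove(Field.METRICS);
            return everything;
        }
        if (fieldsToRetrieve.contains(Field.ALL)) {
            // fields=all: load every field, including configs and metrics.
            return everything;
        }
        // Otherwise load exactly what was requested.
        return EnumSet.copyOf(fieldsToRetrieve);
    }
}
```

Centralizing the rule in one place would keep the default behavior identical across storage implementations rather than leaving each store to interpret null on its own.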
[jira] [Commented] (YARN-3051) [Storage abstraction] Create backing storage read interface for ATS readers
[ https://issues.apache.org/jira/browse/YARN-3051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14384761#comment-14384761 ] Varun Saxena commented on YARN-3051: bq. We may not only need to do queries for timeline entities, but also something solely for their configs and/or metrics But IIUC, metrics and configs would still be tied to, or encapsulated inside, an entity. The entity may be a cluster, an application, or something else. So when I want all configs for an app, I specify fields=configs in the REST URL. And if I want metrics and configs for an app, I can say fields=configs,metrics.
[jira] [Commented] (YARN-3051) [Storage abstraction] Create backing storage read interface for ATS readers
[ https://issues.apache.org/jira/browse/YARN-3051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14384783#comment-14384783 ] Li Lu commented on YARN-3051: - bq. So when I say get all configs for an app. I do that by specifying fields=configs in REST URL. And if I want metrics and configs for an app, I can say fields=configs,metrics. OK, I'm just thinking out loud. So do we need to touch both the entity table and the config/metric table in the underlying storage? Now suppose I already have a timeline entity, without its metrics, and I'd like to draw a time series for its hdfs_bytes_write. Do I need to regenerate the timeline entity together with the metric, or can I say something like "get hdfs_bytes_write for this context"? BTW, we may want to consider the relationship between the context and timeline entities on the reader side. The context information is the PK of the timeline entity rows.
[jira] [Commented] (YARN-3051) [Storage abstraction] Create backing storage read interface for ATS readers
[ https://issues.apache.org/jira/browse/YARN-3051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14384785#comment-14384785 ] Varun Saxena commented on YARN-3051:

Just to elaborate further, the API below will be used to serve the use case above.

{code}
TimelineEntity getEntity(String entityId, String entityType,
    EnumSet fieldsToRetrieve)
{code}

Assuming the entity id will be the same as the app id if the entity type is "application", we can fetch configs for application_12345_0001 like below:

{{http:///application/application_12345_0001?fields=configs}}
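To connect the REST URL to the {{EnumSet fieldsToRetrieve}} argument, the web layer would need to turn a value such as {{fields=configs,metrics}} into an enum set. A minimal sketch of that step is below; the {{QueryField}} enum and the {{FieldsParam.parse}} helper are hypothetical names used only for illustration, not part of the patch.

```java
import java.util.EnumSet;
import java.util.Locale;

// Hypothetical set of retrievable fields; names are illustrative only.
enum QueryField { EVENTS, INFO, CONFIGS, METRICS }

final class FieldsParam {
    private FieldsParam() {}

    // Parses the value of the "fields" query parameter, e.g. "configs,metrics".
    // Returns null when the parameter is absent or blank, so that the caller
    // can apply whatever default view has been agreed on.
    // Throws IllegalArgumentException for an unknown field name.
    static EnumSet<QueryField> parse(String fieldsParam) {
        if (fieldsParam == null || fieldsParam.trim().isEmpty()) {
            return null;
        }
        EnumSet<QueryField> fields = EnumSet.noneOf(QueryField.class);
        for (String token : fieldsParam.split(",")) {
            String name = token.trim().toUpperCase(Locale.ROOT);
            if (!name.isEmpty()) {
                fields.add(QueryField.valueOf(name));
            }
        }
        return fields;
    }
}
```

With this, a request ending in {{?fields=configs}} would reach getEntity with a set containing only CONFIGS, while a request without the parameter would pass null.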
[jira] [Commented] (YARN-3051) [Storage abstraction] Create backing storage read interface for ATS readers
[ https://issues.apache.org/jira/browse/YARN-3051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14384788#comment-14384788 ] Varun Saxena commented on YARN-3051: Hmm... If you don't mind, can you share the schema decided on for the Phoenix-based storage? That will be helpful in designing the API.
[jira] [Commented] (YARN-3051) [Storage abstraction] Create backing storage read interface for ATS readers
[ https://issues.apache.org/jira/browse/YARN-3051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14384792#comment-14384792 ] Li Lu commented on YARN-3051: - Sure. Will post it soon.
[jira] [Commented] (YARN-3051) [Storage abstraction] Create backing storage read interface for ATS readers
[ https://issues.apache.org/jira/browse/YARN-3051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14384797#comment-14384797 ] Varun Saxena commented on YARN-3051: Thanks.
[jira] [Commented] (YARN-3051) [Storage abstraction] Create backing storage read interface for ATS readers
[ https://issues.apache.org/jira/browse/YARN-3051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14384796#comment-14384796 ] Li Lu commented on YARN-3051: - BTW, the reader APIs are not only for the Phoenix storage itself. We also need to consider the HBase implementation. On the design side, we may want to consider the common strategies, and I don't think a single storage implementation would block the progress of this JIRA.
[jira] [Commented] (YARN-3051) [Storage abstraction] Create backing storage read interface for ATS readers
[ https://issues.apache.org/jira/browse/YARN-3051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14384802#comment-14384802 ] Varun Saxena commented on YARN-3051: Yeah, it should not block progress of this JIRA. I was just trying to understand your use case better.
[jira] [Commented] (YARN-3051) [Storage abstraction] Create backing storage read interface for ATS readers
[ https://issues.apache.org/jira/browse/YARN-3051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14385025#comment-14385025 ] Sangjin Lee commented on YARN-3051: ---

Wanted to add my 2 cents before the weekend.

As for the default view of an entity, are we agreed then that it means all the data at the level of the entity but *not* going into config/metrics/info? I want to stress that this default behavior should be explicit in the code so there is no confusion.

I think it's up to us to define, in terms of APIs, how best to capture all the query use cases. If it can be worked through fieldsToRetrieve, that is fine. We need to make sure the APIs are clear in terms of what they do.

The following are the types of queries I can think of that this storage reader API (and the reader itself) would need to support. This is not an exhaustive list. There may be more. But at least these need to be supported well:
- given an id, return the entity (default; see above)
- given an id, return all metrics of the entity
- given an id, return the entire config of the entity
- given an id, return the entity along with metrics/configs/info
- (optional?) given an id, return one metric or some metrics (by name) of the entity (possibly retrieving the time series of its values)
- (optional?) given an id, return one or some config entries (by name) of the entity
- (need to give some more thought) relational queries (e.g. given an app id, return the app entity along with its containers)

Again, this is not an exhaustive list, or even a completely thought-out list. But it should give us some idea on how to define the APIs. Hope this helps...
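Of the query types listed in the comment above, "return one metric or some metrics (by name), possibly with the time series of its values" is the one that fieldsToRetrieve alone does not obviously cover. As a rough sketch of what such a selection could look like once an entity's metrics are in memory, assume (purely for illustration) that each metric is held as a map from timestamp to value; {{MetricSelection.select}} is a hypothetical helper, not an API from the patch.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;
import java.util.SortedMap;

final class MetricSelection {
    private MetricSelection() {}

    // Returns only the metrics whose names were requested, each with its
    // time series (timestamp -> value). A null name set means "all metrics".
    static Map<String, SortedMap<Long, Long>> select(
            Map<String, SortedMap<Long, Long>> allMetrics, Set<String> names) {
        Map<String, SortedMap<Long, Long>> selected = new LinkedHashMap<>();
        for (Map.Entry<String, SortedMap<Long, Long>> e : allMetrics.entrySet()) {
            if (names == null || names.contains(e.getKey())) {
                selected.put(e.getKey(), e.getValue());
            }
        }
        return selected;
    }
}
```

A real store would preferably push the name restriction down into its own query so that unrequested metrics are never read; the in-memory filter only shows the intended result shape.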