[ https://issues.apache.org/jira/browse/YARN-5105?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15301193#comment-15301193 ]
Joep Rottinghuis commented on YARN-5105:
----------------------------------------

While I agree that we can postpone the decision on whether to add more complexity later, such as a time range and/or a count range, I feel we need to leave the door open to do so. That leads me to a slightly different opinion on adding a single boolean isTimeSeries yes/no type of argument. If we think forward, how would a range work with isTimeSeries? Would we then have a time range and also mandate multiple values with "isTimeSeries"?

In addition, the boolean alone doesn't immediately convey that saying false gets you 1 value (the latest one) back, as opposed to skipping metrics altogether. I think we can already skip metrics by specifying the fields to retrieve.

Read for example the javadoc on TimelineDataToRetrieve:
{code}
 * <li><b>isTimeSeries</b> - If fieldsToRetrieve contains METRICS/ALL or
 * metricsToRetrieve is specified, this boolean flag indicates whether a time
 * series needs to be returned for these metrics. The flag is ignored if METRICS
 * are not to be fetched.</li>
{code}
It isn't quite clear that 1 row is returned if isTimeSeries is false.

Admittedly, TimelineReaderWebServices is a bit more explicit:
{code}
 * @param timeSeries If specified, defines whether a metric time series needs
 *     to be returned if fields contains METRICS/ALL or metricsToRetrieve is
 *     specified. Ignored otherwise. If value is true, means time series will
 *     be returned. All other values will be treated as false, including when
 *     this parameter is unspecified. In such cases, latest single value of
 *     metric(s) will be returned (Optional query param).
{code}
It is still a little confusing.

Given that we already have the concept of limit to cap the number of entities we return, why not change the timeseries argument from a boolean to a timeserieslimit? We'd document that the default is 1 and that -1 means no limit (i.e. retrieve the entire time series).
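As a rough illustration of the timeserieslimit semantics proposed above (default 1, -1 for no limit), the query parameter could be parsed along these lines. This is only a sketch: the class and method names are made up for illustration and are not part of TimelineReaderWebServices, and rejecting 0 and values below -1 is my assumption rather than anything decided on this JIRA.

```java
/** Hypothetical helper for illustration; not part of the YARN code base. */
final class TimeSeriesLimitSketch {
  /** Proposed default: return only the latest value of each metric. */
  static final int DEFAULT_LIMIT = 1;
  /** Proposed sentinel: return the entire time series. */
  static final int NO_LIMIT = -1;

  /** Parses the raw query parameter; null or blank falls back to the default. */
  static int parseTimeSeriesLimit(String raw) {
    if (raw == null || raw.trim().isEmpty()) {
      return DEFAULT_LIMIT;
    }
    int limit;
    try {
      limit = Integer.parseInt(raw.trim());
    } catch (NumberFormatException e) {
      throw new IllegalArgumentException(
          "timeserieslimit must be an integer: " + raw, e);
    }
    // Assumption: 0 and values below -1 have no sensible meaning, so reject them.
    if (limit == NO_LIMIT || limit >= 1) {
      return limit;
    }
    throw new IllegalArgumentException(
        "timeserieslimit must be -1 or a positive integer: " + raw);
  }

  public static void main(String[] args) {
    System.out.println(parseTimeSeriesLimit(null));
    System.out.println(parseTimeSeriesLimit("-1"));
  }
}
```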
Furthermore, we can specify for now that the only two values allowed are -1 and 1. In other words, -1 means no limit; otherwise only one record is returned.

The query limiting maps relatively neatly to the HBase get. ApplicationEntityReader.getResults in your latest patch was:
{code}
if (getDataToRetrieve().isTimeSeries()) {
  get.setMaxVersions(Integer.MAX_VALUE);
}
{code}
and would become:
{code}
int limit = getDataToRetrieve().getTimeSeriesLimit();
// -1 means no limit; any other value caps the versions fetched per column
get.setMaxVersions(limit == -1 ? Integer.MAX_VALUE : limit);
{code}

I agree that we shouldn't try to distinguish between separate limits for separate columns for now, to keep things simple. If we were later to add a time range to give more flexibility in limiting which records are retrieved, that would be relatively orthogonal to timeSeriesLimit. We'd simply return the last # metrics (per column) that fall within the specified range.

> entire time series is returned for YARN container system metrics (CPU and memory)
> ---------------------------------------------------------------------------------
>
>                 Key: YARN-5105
>                 URL: https://issues.apache.org/jira/browse/YARN-5105
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: timelineserver
>    Affects Versions: YARN-2928
>            Reporter: Sangjin Lee
>            Assignee: Varun Saxena
>              Labels: yarn-2928-1st-milestone
>         Attachments: YARN-5105-YARN-2928.01.patch, YARN-5105-YARN-2928.02.patch, YARN-5105-YARN-2928.03.patch
>
>
> I see that the entire time series of the CPU and memory metrics are returned
> for the YARN containers REST query. This has a potential of bloating the
> output big time.
> {noformat}
>   "metrics": [
>     {
>       "type": "TIME_SERIES",
>       "id": "MEMORY",
>       "values":
>       {
>         "1463518173363": 407539712,
>         "1463518170347": 407539712,
> {noformat}
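The remark in the comment above, that a time range would be orthogonal to timeSeriesLimit, can be sketched in plain Java without any HBase dependency. The class and method names below are hypothetical, the inclusive range bounds are an assumption, and the third sample in main is invented for illustration; only the first two timestamps come from the issue description.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical illustration; not part of the YARN code base. */
final class TimeRangeLimitSketch {
  static final int NO_LIMIT = -1;

  /**
   * Returns the newest values of one metric column whose timestamps fall in
   * [begin, end], newest first, capped at limit entries (-1 means no cap).
   */
  static Map<Long, Long> latestWithinRange(
      Map<Long, Long> values, long begin, long end, int limit) {
    if (begin > end) {
      throw new IllegalArgumentException("begin must not be after end");
    }
    if (limit != NO_LIMIT && limit < 1) {
      throw new IllegalArgumentException("limit must be -1 or positive");
    }
    // Sort by timestamp (ascending) so the range can be cut out directly.
    TreeMap<Long, Long> sorted = new TreeMap<>();
    sorted.putAll(values);

    Map<Long, Long> result = new LinkedHashMap<>();
    // Walk the in-range entries from newest to oldest and stop at the limit.
    for (Map.Entry<Long, Long> e
        : sorted.subMap(begin, true, end, true).descendingMap().entrySet()) {
      if (limit != NO_LIMIT && result.size() >= limit) {
        break;
      }
      result.put(e.getKey(), e.getValue());
    }
    return result;
  }

  public static void main(String[] args) {
    Map<Long, Long> memory = new TreeMap<>();
    memory.put(1463518173363L, 407539712L); // from the issue description
    memory.put(1463518170347L, 407539712L); // from the issue description
    memory.put(1463518167331L, 400000000L); // hypothetical older sample
    System.out.println(
        latestWithinRange(memory, 1463518167331L, 1463518170347L, 1));
  }
}
```

The range filter and the limit are applied independently of each other here, which is the sense in which the two options would not interfere: the range decides which samples are eligible, and the limit decides how many of the newest eligible ones are kept.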