[jira] [Commented] (YARN-4179) [reader implementation] support flow activity queries based on time
[ https://issues.apache.org/jira/browse/YARN-4179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15369788#comment-15369788 ]

Hudson commented on YARN-4179:
------------------------------

SUCCESS: Integrated in Hadoop-trunk-Commit #10074 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/10074/])
YARN-4179. [reader implementation] support flow activity queries based (sjlee: rev e3e857866d9fdefb7e353b21ae24eab4401e60b3)
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice/src/main/java/org/apache/hadoop/yarn/server/timelineservice/storage/FlowActivityEntityReader.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice/src/main/java/org/apache/hadoop/yarn/server/timelineservice/storage/flow/FlowActivityRowKey.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice/src/main/java/org/apache/hadoop/yarn/server/timelineservice/reader/TimelineReaderWebServices.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice/src/test/java/org/apache/hadoop/yarn/server/timelineservice/reader/TestTimelineReaderWebServicesHBaseStorage.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/records/timelineservice/FlowActivityEntity.java

> [reader implementation] support flow activity queries based on time
> -------------------------------------------------------------------
>
> Key: YARN-4179
> URL: https://issues.apache.org/jira/browse/YARN-4179
> Project: Hadoop YARN
> Issue Type: Sub-task
> Components: timelineserver
> Affects Versions: YARN-2928
> Reporter: Sangjin Lee
> Assignee: Varun Saxena
> Priority: Minor
> Fix For: YARN-2928
>
> Attachments: YARN-4179-YARN-2928.01.patch,
> YARN-4179-YARN-2928.02.patch, YARN-4179-YARN-2928.03.patch
>
>
> This came up as part of YARN-4074 and YARN-4075.
> Currently the only query pattern that's supported on the flow activity table
> is by cluster only. But it might be useful to support queries by cluster and
> certain date or dates.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4179) [reader implementation] support flow activity queries based on time
[ https://issues.apache.org/jira/browse/YARN-4179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14970264#comment-14970264 ]

Varun Saxena commented on YARN-4179:
------------------------------------

Thanks [~sjlee0] for the commit.
[jira] [Commented] (YARN-4179) [reader implementation] support flow activity queries based on time
[ https://issues.apache.org/jira/browse/YARN-4179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14969456#comment-14969456 ]

Sangjin Lee commented on YARN-4179:
-----------------------------------

Thanks for updating the patch [~varun_saxena]. I'm +1 on the latest patch (v.3). I'll let others comment on this for today before I commit it. Thanks!
[jira] [Commented] (YARN-4179) [reader implementation] support flow activity queries based on time
[ https://issues.apache.org/jira/browse/YARN-4179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14967556#comment-14967556 ]

Hadoop QA commented on YARN-4179:
---------------------------------

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | pre-patch | 18m 12s | Findbugs (version ) appears to be broken on YARN-2928. |
| {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. |
| {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 1 new or modified test files. |
| {color:green}+1{color} | javac | 8m 10s | There were no new javac warning messages. |
| {color:green}+1{color} | javadoc | 10m 36s | There were no new javadoc warning messages. |
| {color:red}-1{color} | release audit | 0m 18s | The applied patch generated 1 release audit warnings. |
| {color:green}+1{color} | checkstyle | 1m 25s | There were no new checkstyle issues. |
| {color:green}+1{color} | whitespace | 0m 2s | The patch has no lines that end in whitespace. |
| {color:green}+1{color} | install | 1m 34s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse | 0m 41s | The patch built with eclipse:eclipse. |
| {color:green}+1{color} | findbugs | 2m 30s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | yarn tests | 0m 24s | Tests passed in hadoop-yarn-api. |
| {color:green}+1{color} | yarn tests | 2m 45s | Tests passed in hadoop-yarn-server-timelineservice. |
| | | 46m 45s | |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | http://issues.apache.org/jira/secure/attachment/12767811/YARN-4179-YARN-2928.03.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | YARN-2928 / 581a6b6 |
| Release Audit | https://builds.apache.org/job/PreCommit-YARN-Build/9510/artifact/patchprocess/patchReleaseAuditProblems.txt |
| hadoop-yarn-api test log | https://builds.apache.org/job/PreCommit-YARN-Build/9510/artifact/patchprocess/testrun_hadoop-yarn-api.txt |
| hadoop-yarn-server-timelineservice test log | https://builds.apache.org/job/PreCommit-YARN-Build/9510/artifact/patchprocess/testrun_hadoop-yarn-server-timelineservice.txt |
| Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/9510/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf904.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | https://builds.apache.org/job/PreCommit-YARN-Build/9510/console |

This message was automatically generated.
[jira] [Commented] (YARN-4179) [reader implementation] support flow activity queries based on time
[ https://issues.apache.org/jira/browse/YARN-4179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14966200#comment-14966200 ]

Sangjin Lee commented on YARN-4179:
-----------------------------------

The latest patch looks pretty good. Only a couple of minor comments.

(TimelineReaderWebServices.java)
- l.125: nit: "daterange" -> "date range" (a couple of other places too)
- l.123-140: I'm pretty sure the logic is correct and does what we intend, but it could use some comments to make it easier to read later. For example, l.138-139 could have a comment saying that it deals with the case where a single date (without "-") was specified, and so on.
[jira] [Commented] (YARN-4179) [reader implementation] support flow activity queries based on time
[ https://issues.apache.org/jira/browse/YARN-4179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14965691#comment-14965691 ]

Hadoop QA commented on YARN-4179:
---------------------------------

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | pre-patch | 19m 44s | Findbugs (version ) appears to be broken on YARN-2928. |
| {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. |
| {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 1 new or modified test files. |
| {color:green}+1{color} | javac | 8m 16s | There were no new javac warning messages. |
| {color:green}+1{color} | javadoc | 10m 49s | There were no new javadoc warning messages. |
| {color:red}-1{color} | release audit | 0m 20s | The applied patch generated 1 release audit warnings. |
| {color:green}+1{color} | checkstyle | 1m 27s | There were no new checkstyle issues. |
| {color:green}+1{color} | whitespace | 0m 2s | The patch has no lines that end in whitespace. |
| {color:green}+1{color} | install | 1m 40s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse | 0m 44s | The patch built with eclipse:eclipse. |
| {color:green}+1{color} | findbugs | 2m 31s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | yarn tests | 0m 25s | Tests passed in hadoop-yarn-api. |
| {color:green}+1{color} | yarn tests | 3m 31s | Tests passed in hadoop-yarn-server-timelineservice. |
| | | 49m 41s | |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | http://issues.apache.org/jira/secure/attachment/12767650/YARN-4179-YARN-2928.02.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | YARN-2928 / 581a6b6 |
| Release Audit | https://builds.apache.org/job/PreCommit-YARN-Build/9495/artifact/patchprocess/patchReleaseAuditProblems.txt |
| hadoop-yarn-api test log | https://builds.apache.org/job/PreCommit-YARN-Build/9495/artifact/patchprocess/testrun_hadoop-yarn-api.txt |
| hadoop-yarn-server-timelineservice test log | https://builds.apache.org/job/PreCommit-YARN-Build/9495/artifact/patchprocess/testrun_hadoop-yarn-server-timelineservice.txt |
| Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/9495/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf905.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | https://builds.apache.org/job/PreCommit-YARN-Build/9495/console |

This message was automatically generated.
[jira] [Commented] (YARN-4179) [reader implementation] support flow activity queries based on time
[ https://issues.apache.org/jira/browse/YARN-4179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14961238#comment-14961238 ]

Sangjin Lee commented on YARN-4179:
-----------------------------------

bq. Now the question comes should we then have a daterange like "-20151001" or support 2 query params. Will go with former as of now.

+1. I think we can stick with a single param, where just "20151001" is interpreted to pick only that date.
[jira] [Commented] (YARN-4179) [reader implementation] support flow activity queries based on time
[ https://issues.apache.org/jira/browse/YARN-4179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14960244#comment-14960244 ]

Varun Saxena commented on YARN-4179:
------------------------------------

bq. There are ways to handle this easily without incurring synchronization overhead however.

Hmm...I see. Although FastDateFormat claims to be faster than SimpleDateFormat in all cases, this is unlikely to be a bottleneck in our case, especially when using ThreadLocal. Will change.

bq. We can discuss whether it's worth supporting them.

Yeah, we can discuss them. What I was thinking was not to make them open-ended (although the patch missed some sanity checks). But on second thought, I think keeping it open-ended with limits may be fine, as you said.
Now the question comes: should we then have a daterange like "-20151001" or support 2 query params? Will go with the former as of now.

Some additional checks (in addition to what you mentioned) are needed in the patch, including whether the date is in the correct format. I think if we set SimpleDateFormat#setLenient to false, that will be taken care of.

Will upload a patch with changes for further review.
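
To illustrate the strict parsing mentioned in the comment above, here is a minimal sketch that uses only the JDK. The class name StrictDateParseSketch, the helper parseDate, the "yyyyMMdd" pattern (chosen to match the "20151001" examples in this thread), and the decision to report failures as IllegalArgumentException are all assumptions made for the example. It is not the code from the patch.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.TimeZone;

class StrictDateParseSketch {
  // Hypothetical helper: parses a yyyyMMdd string as a GMT date and
  // returns milliseconds since the epoch.
  static long parseDate(String s) {
    // A fresh formatter per call avoids sharing the non-thread-safe
    // SimpleDateFormat between threads.
    SimpleDateFormat format = new SimpleDateFormat("yyyyMMdd");
    format.setTimeZone(TimeZone.getTimeZone("GMT"));
    // Strict mode: out-of-range fields such as month 13 are rejected
    // instead of being rolled over into the following year.
    format.setLenient(false);
    try {
      return format.parse(s).getTime();
    } catch (ParseException e) {
      throw new IllegalArgumentException("Invalid date: " + s, e);
    }
  }

  public static void main(String[] args) {
    System.out.println(parseDate("20151001"));
    try {
      parseDate("20151301"); // month 13 is not a valid month
    } catch (IllegalArgumentException e) {
      System.out.println("rejected: " + e.getMessage());
    }
  }
}
```

With the default lenient setting, the second call would be accepted and silently interpreted as a date in January 2016, which is the kind of input the extra format check is meant to catch.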
[jira] [Commented] (YARN-4179) [reader implementation] support flow activity queries based on time
[ https://issues.apache.org/jira/browse/YARN-4179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14959590#comment-14959590 ]

Sangjin Lee commented on YARN-4179:
-----------------------------------

BTW, another thing: we should verify the start date is earlier than the end date. I don't think we're checking that.
[jira] [Commented] (YARN-4179) [reader implementation] support flow activity queries based on time
[ https://issues.apache.org/jira/browse/YARN-4179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14959587#comment-14959587 ]

Sangjin Lee commented on YARN-4179:
-----------------------------------

Yes, I know that Java's SimpleDateFormat is not thread safe. There are ways to handle this easily without incurring synchronization overhead, however. One pattern is

{code}
// Explicit type argument: the diamond is not allowed on anonymous
// classes before Java 9.
static ThreadLocal<DateFormat> DATE_FORMAT =
    new ThreadLocal<DateFormat>() {
      @Override
      protected DateFormat initialValue() {
        return new SimpleDateFormat(...);
      }
    };
{code}

{quote}
When Date is converted to JSON, it is represented as a long. Hence when JSON parsing is done at the client side, getInfo().get(DATE_INFO_KEY) returns a long. That is why the conversion.
{quote}
Got it. Thanks for the clarification.

{quote}
And I have chosen a single query param daterange (delimited by "-"), i.e. a specific date or a range. If we want to specify a startdate and enddate we will need 2 query params. If startdate is not specified, every date starting from 1970 till enddate can be taken (constrained by limit), and if enddate isn't specified every date from startdate till today can be taken. Do you want this approach?
{quote}
So "20151001-20151031" would return all records between 10/1 and 10/31 (both inclusive), right? And "20151001" would return records only for that date. Is either "20151001-" or "-20151001" legal? If so, what would they do? The same as "20151001" (I suspect)?

I am fine with that approach, but I think there is a little bit of additional value in interpreting "20151001-" to mean "20151001-(now)", and similarly "-20151001" to mean "(ages ago)-20151001" (I wasn't suggesting having 2 query params; just a different interpretation of those values). We can discuss whether it's worth supporting them.

IMO open-ended queries do not make things worse. Note that the limit is always used (even if the user did not provide one). Users can even query without any date range, which is open-ended on both sides. The limit is what makes the queries sane. The date range queries would always be more constraining than those without.
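
The four daterange forms discussed above ("20151001-20151031", "20151001", "20151001-" and "-20151001") could be interpreted along the following lines. This is only a sketch of one possible reading, under stated assumptions: the class DateRangeSketch, the methods parseDate and parseDateRange, the "yyyyMMdd" pattern in GMT, epoch 0 as the lower bound for a missing start date, and the current time as the upper bound for a missing end date are all hypothetical and are not taken from the patch. It reuses the ThreadLocal formatter pattern from the comment and also rejects a start date later than the end date.

```java
import java.text.DateFormat;
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Arrays;
import java.util.TimeZone;

class DateRangeSketch {
  // One formatter per thread, so no synchronization is needed.
  private static final ThreadLocal<DateFormat> DATE_FORMAT =
      new ThreadLocal<DateFormat>() {
        @Override
        protected DateFormat initialValue() {
          SimpleDateFormat format = new SimpleDateFormat("yyyyMMdd");
          format.setTimeZone(TimeZone.getTimeZone("GMT"));
          format.setLenient(false);
          return format;
        }
      };

  // Hypothetical helper: yyyyMMdd string to epoch milliseconds (GMT).
  static long parseDate(String s) {
    try {
      return DATE_FORMAT.get().parse(s).getTime();
    } catch (ParseException e) {
      throw new IllegalArgumentException("Invalid date: " + s, e);
    }
  }

  // Returns {start, end} in epoch milliseconds, both bounds inclusive.
  static long[] parseDateRange(String range) {
    int dash = range.indexOf('-');
    if (dash < 0) {
      // Single date without "-": select that day only.
      long day = parseDate(range);
      return new long[] {day, day};
    }
    String startStr = range.substring(0, dash);
    String endStr = range.substring(dash + 1);
    // Missing start means "from the epoch"; missing end means "until now".
    long start = startStr.isEmpty() ? 0L : parseDate(startStr);
    long end = endStr.isEmpty() ? System.currentTimeMillis() : parseDate(endStr);
    if (start > end) {
      throw new IllegalArgumentException("Start date is after end date: " + range);
    }
    return new long[] {start, end};
  }

  public static void main(String[] args) {
    System.out.println(Arrays.toString(parseDateRange("20151001-20151031")));
    System.out.println(Arrays.toString(parseDateRange("20151001")));
    System.out.println(Arrays.toString(parseDateRange("-20151001")));
  }
}
```

Because the limit is always applied to the query, the open-ended forms only widen the scanned range. They do not remove the bound on the number of records returned.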
[jira] [Commented] (YARN-4179) [reader implementation] support flow activity queries based on time
[ https://issues.apache.org/jira/browse/YARN-4179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14959442#comment-14959442 ]

Varun Saxena commented on YARN-4179:
------------------------------------

Typos in my first comment so writing it again.

bq. As I mentioned in an earlier comment, can we simply use JDK's date formatter for this? It would be good to avoid adding a new dependency unless it is absolutely necessary.

Java's SimpleDateFormat is not thread safe. Apache commons-lang's FastDateFormat claims to be 25% faster too. Used commons-lang3 because commons-lang's FastDateFormat does not provide date parsing capability.
If you want to avoid adding a dependency and we have decided to use a single date format (yyyyMMdd), I can write a custom method to do date parsing as well and convert it into seconds since epoch.
[jira] [Commented] (YARN-4179) [reader implementation] support flow activity queries based on time
[ https://issues.apache.org/jira/browse/YARN-4179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14959439#comment-14959439 ]

Varun Saxena commented on YARN-4179:
------------------------------------

[~sjlee0]

bq. As I mentioned in an earlier comment, can we simply use JDK's date formatter for this? It would be good to avoid adding a new dependency unless it is absolutely necessary.

Java's {{SimpleDateFormat}} is not thread safe. Apache commons-lang's {{FastDateFormat}} claims to be 25% faster too. Used commons-lang3 because commons-lang's {{FastDateFormat}} does not provide date parsing capability.
If you want to avoid adding a dependency and we will use a single date format, I can write a custom method to do date parsing as well and convert it into seconds since epoch.

bq. now the question of a more "standard" date format can be highly subjective, but how about using "yyyyMMdd"?

Ok. We can go with this format.

bq. instead of duplicating the date format here, we should simply reuse what's defined in TimelineReaderWebServices

Ok.

bq. Why is this conversion from long to Date needed?

When Date is converted to JSON, it is represented as a long. Hence when JSON parsing is done at the client side, {{getInfo().get(DATE_INFO_KEY)}} returns a long. That is why the conversion.

bq. let's use a normal java predicate style date != null

Ok.

bq. nit: parseDate can return primitive long

Ok.

Regarding your other comments, I will add a javadoc.
And I have chosen a single query param daterange (delimited by "-"), i.e. a specific date or a range. If we want to specify a startdate and enddate we will need 2 query params. If startdate is not specified, every date starting from 1970 till enddate can be taken (constrained by limit), and if enddate isn't specified every date from startdate till today can be taken. Do you want this approach?
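
The conversion explained above can be pictured with a small sketch. It assumes an info map with Object values, as the getInfo().get(DATE_INFO_KEY) call in the comment suggests. The class DateInfoSketch, the key value "DATE", and the getDate helper are hypothetical names for the example and do not reproduce FlowActivityEntity.

```java
import java.util.Date;
import java.util.HashMap;
import java.util.Map;

class DateInfoSketch {
  // Hypothetical key value; only the constant name appears in the thread.
  static final String DATE_INFO_KEY = "DATE";

  // Returns the stored date whether it is still a Date (server side) or
  // has become a number after a JSON round trip (client side).
  static Date getDate(Map<String, Object> info) {
    Object value = info.get(DATE_INFO_KEY);
    if (value instanceof Date) {
      return (Date) value;
    }
    if (value instanceof Number) {
      // JSON carried the date as epoch milliseconds.
      return new Date(((Number) value).longValue());
    }
    return null; // key absent or of an unexpected type
  }

  public static void main(String[] args) {
    Map<String, Object> info = new HashMap<String, Object>();
    info.put(DATE_INFO_KEY, Long.valueOf(1443657600000L));
    System.out.println(getDate(info).getTime());
  }
}
```

Storing the epoch value as a long from the start, as suggested elsewhere in this thread, would make the branch on the value type unnecessary.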
[jira] [Commented] (YARN-4179) [reader implementation] support flow activity queries based on time
[ https://issues.apache.org/jira/browse/YARN-4179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14959327#comment-14959327 ]

Hadoop QA commented on YARN-4179:
---------------------------------

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | pre-patch | 22m 59s | Findbugs (version ) appears to be broken on YARN-2928. |
| {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. |
| {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 1 new or modified test files. |
| {color:green}+1{color} | javac | 10m 20s | There were no new javac warning messages. |
| {color:green}+1{color} | javadoc | 14m 10s | There were no new javadoc warning messages. |
| {color:red}-1{color} | release audit | 0m 26s | The applied patch generated 1 release audit warnings. |
| {color:green}+1{color} | checkstyle | 1m 48s | There were no new checkstyle issues. |
| {color:green}+1{color} | whitespace | 0m 2s | The patch has no lines that end in whitespace. |
| {color:green}+1{color} | install | 2m 4s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse | 0m 57s | The patch built with eclipse:eclipse. |
| {color:green}+1{color} | findbugs | 3m 25s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | yarn tests | 0m 33s | Tests passed in hadoop-yarn-api. |
| {color:green}+1{color} | yarn tests | 3m 52s | Tests passed in hadoop-yarn-server-timelineservice. |
| | | 60m 45s | |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | http://issues.apache.org/jira/secure/attachment/12765893/YARN-4179-YARN-2928.01.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | YARN-2928 / bd5af9c |
| Release Audit | https://builds.apache.org/job/PreCommit-YARN-Build/9457/artifact/patchprocess/patchReleaseAuditProblems.txt |
| hadoop-yarn-api test log | https://builds.apache.org/job/PreCommit-YARN-Build/9457/artifact/patchprocess/testrun_hadoop-yarn-api.txt |
| hadoop-yarn-server-timelineservice test log | https://builds.apache.org/job/PreCommit-YARN-Build/9457/artifact/patchprocess/testrun_hadoop-yarn-server-timelineservice.txt |
| Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/9457/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf907.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | https://builds.apache.org/job/PreCommit-YARN-Build/9457/console |

This message was automatically generated.
[jira] [Commented] (YARN-4179) [reader implementation] support flow activity queries based on time
[ https://issues.apache.org/jira/browse/YARN-4179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14959321#comment-14959321 ]

Sangjin Lee commented on YARN-4179:
-----------------------------------

(FlowActivityEntity.java)
- l.143-151: Why is this conversion from long to {{Date}} needed? I thought we always set a {{Date}} object via {{setDate}}? Did you see a case where a long was set? On a related note, do you think it would be easier if we set the long value (unix epoch) instead of the Date object to begin with? Might that make things a little easier? But one way or another, I don't think we need branching code here when we control these instances via {{setDate}}.
- l.144: nit (assuming this change is needed): let's use a normal java predicate style {{date != null}}

(TimelineReaderWebServices.java)
- l.81: As I mentioned in an earlier comment, can we simply use JDK's date formatter for this? It would be good to avoid adding a new dependency unless it is absolutely necessary.
- l.82: now the question of a more "standard" date format can be highly subjective, but how about using "yyyyMMdd"?
- l.88: It would be good to document in javadoc explicitly (see below for a little more discussion) what is an allowed date range (here and perhaps in the actual REST API methods).
- l.97: nit: {{parseDate}} can return primitive long
- l.107: I see that if the start date is missing we're ignoring the date range param altogether. Is this reasonable? Could there be a legitimate use case where one specifies only the end date (perhaps along with a limit)? Should we not support it?
- l.115: I'm a little confused by this. If the end date is missing, shouldn't we assume that it is open-ended (all the way to now)? Or are we interpreting that to mean the user wants that specific date and that date only? This also shows that we need more comments here to specify the behavior explicitly. :)

(TestTimelineReaderWebServicesHBaseStorage.java)
- l.406: instead of duplicating the date format here, we should simply reuse what's defined in TimelineReaderWebServices, making that variable public or package-scope
[jira] [Commented] (YARN-4179) [reader implementation] support flow activity queries based on time
[ https://issues.apache.org/jira/browse/YARN-4179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14959260#comment-14959260 ]

Sangjin Lee commented on YARN-4179:
-----------------------------------

From the user's point of view, I think it makes the most sense to use a date format and assume GMT. In terms of tying it back to the flow activity record, we should be able to use the times coming from the user and converted with the date format, right? We just need to ensure it agrees with the "top-of-the-day" method we have so that if the user provided the right date it would select the records with that date.

I'll go over the patch in more detail, but I noticed that we're introducing a new dependency for date parsing (commons-lang3). Is that necessary? Can this be accomplished with the existing JDK API and the old commons-lang? What is the gap?
[jira] [Commented] (YARN-4179) [reader implementation] support flow activity queries based on time
[ https://issues.apache.org/jira/browse/YARN-4179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14951052#comment-14951052 ]

Varun Saxena commented on YARN-4179:
------------------------------------

The approach chosen in the patch is that the date will be in the format ddMM and the timezone will be assumed to be GMT. The issue here is whether we should fix the format, since the popular date format varies across geographies.

Another solution is that the date range can be specified as seconds since epoch. The issue here is that, instead of a single date, any timestamp within a day can be specified. We can, however, normalize each timestamp in the date range by converting it into the top-of-the-day timestamp and then firing the query to the backend.

So I would welcome views from others on which approach to follow.
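
The normalization mentioned above, converting any timestamp within a day to the top-of-the-day timestamp, can be sketched as plain arithmetic when days are counted in GMT. The class TopOfDaySketch and the method topOfDay are hypothetical names, and the sketch assumes non-negative timestamps expressed in seconds since the epoch. It is not the project's own "top-of-the-day" method.

```java
class TopOfDaySketch {
  static final long SECONDS_PER_DAY = 24L * 60 * 60;

  // Rounds a non-negative epoch timestamp (in seconds) down to 00:00:00 GMT
  // of the same day. GMT has no daylight saving shifts, so every day is
  // exactly SECONDS_PER_DAY long in epoch time.
  static long topOfDay(long epochSeconds) {
    return epochSeconds - (epochSeconds % SECONDS_PER_DAY);
  }

  public static void main(String[] args) {
    // Some instant during 2015-10-01 GMT and the start of that day.
    System.out.println(topOfDay(1443700000L));
    System.out.println(topOfDay(1443657600L));
  }
}
```

With this normalization, every timestamp from the same GMT day maps to the same value, so a range given as arbitrary timestamps selects whole days.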