[jira] [Commented] (YARN-4053) Change the way metric values are stored in HBase Storage
[ https://issues.apache.org/jira/browse/YARN-4053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15369735#comment-15369735 ]

Hudson commented on YARN-4053:
------------------------------

SUCCESS: Integrated in Hadoop-trunk-Commit #10074 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/10074/])
YARN-4053. Change the way metric values are stored in HBase Storage (sjlee: rev 51254a6b5133c8abfec4b7d2ac9477d112b3ccfa)
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice/src/main/java/org/apache/hadoop/yarn/server/timelineservice/storage/application/ApplicationColumnPrefix.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice/src/main/java/org/apache/hadoop/yarn/server/timelineservice/storage/common/ColumnHelper.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice/src/main/java/org/apache/hadoop/yarn/server/timelineservice/storage/common/package-info.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice/src/main/java/org/apache/hadoop/yarn/server/timelineservice/storage/entity/EntityColumn.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice/src/test/java/org/apache/hadoop/yarn/server/timelineservice/storage/flow/TestHBaseStorageFlowRun.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice/src/main/java/org/apache/hadoop/yarn/server/timelineservice/storage/entity/EntityColumnPrefix.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice/src/main/java/org/apache/hadoop/yarn/server/timelineservice/storage/FlowRunEntityReader.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice/src/main/java/org/apache/hadoop/yarn/server/timelineservice/storage/common/ValueConverter.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice/src/main/java/org/apache/hadoop/yarn/server/timelineservice/storage/flow/FlowRunColumn.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice/src/main/java/org/apache/hadoop/yarn/server/timelineservice/storage/flow/FlowScanner.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice/src/main/java/org/apache/hadoop/yarn/server/timelineservice/storage/common/NumericValueConverter.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice/src/test/java/org/apache/hadoop/yarn/server/timelineservice/storage/TestHBaseTimelineStorage.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice/src/main/java/org/apache/hadoop/yarn/server/timelineservice/storage/common/GenericConverter.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice/src/main/java/org/apache/hadoop/yarn/server/timelineservice/storage/flow/FlowRunColumnPrefix.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice/src/main/java/org/apache/hadoop/yarn/server/timelineservice/storage/common/LongConverter.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice/src/main/java/org/apache/hadoop/yarn/server/timelineservice/storage/common/TimelineStorageUtils.java

> Change the way metric values are stored in HBase Storage
> --------------------------------------------------------
>
> Key: YARN-4053
> URL: https://issues.apache.org/jira/browse/YARN-4053
> Project: Hadoop YARN
> Issue Type: Sub-task
> Components: timelineserver
> Affects Versions: YARN-2928
> Reporter: Varun Saxena
> Assignee: Varun Saxena
> Labels: yarn-2928-1st-milestone
> Fix For: YARN-2928
>
> Attachments: YARN-4053-YARN-2928.01.patch,
> YARN-4053-YARN-2928.02.patch, YARN-4053-feature-YARN-2928.03.patch,
> YARN-4053-feature-YARN-2928.04.patch, YARN-4053-feature-YARN-2928.05.patch,
> YARN-4053-feature-YARN-2928.06.patch, YARN-4053-feature-YARN-2928.07.patch,
> YARN-4053-feature-YARN-2928.08.patch
>
>
> Currently, the HBase implementation uses GenericObjectMapper to convert and store
> values in the backend HBase storage. This converts everything into a string
> representation (ASCII/UTF-8 encoded byte array).
> While this is fine in most cases, it does not quite serve our use case for
> metrics.
> So we need to decide how we are going to encode and decode metric values and
> store them in HBase.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
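The issue description says that a string representation does not serve the metrics use case, without spelling out why. As one plausible illustration (an assumption, not something stated in the thread), the sketch below contrasts a decimal-string encoding with a fixed-width 8-byte encoding of a long. The class and method names are hypothetical, and `java.nio.ByteBuffer` stands in for whatever byte utilities the actual patch uses.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of the patch: contrasts two encodings of a long metric value.
final class MetricEncodingSketch {

  // String-style encoding: the decimal text of the value as UTF-8 bytes (variable length).
  static byte[] encodeAsString(long value) {
    return Long.toString(value).getBytes(StandardCharsets.UTF_8);
  }

  // Binary encoding: always 8 bytes, big-endian (ByteBuffer's default byte order).
  static byte[] encodeAsLong(long value) {
    return ByteBuffer.allocate(Long.BYTES).putLong(value).array();
  }

  // Reads back a value written by encodeAsLong.
  static long decodeLong(byte[] bytes) {
    return ByteBuffer.wrap(bytes).getLong();
  }

  // Unsigned lexicographic comparison of raw byte arrays.
  static int compareBytes(byte[] a, byte[] b) {
    int n = Math.min(a.length, b.length);
    for (int i = 0; i < n; i++) {
      int diff = (a[i] & 0xff) - (b[i] & 0xff);
      if (diff != 0) {
        return diff;
      }
    }
    return a.length - b.length;
  }

  public static void main(String[] args) {
    // As strings, "9" sorts after "10" because '9' > '1'.
    System.out.println(compareBytes(encodeAsString(9), encodeAsString(10)) > 0);
    // As fixed-width longs, non-negative values keep their numeric order.
    System.out.println(compareBytes(encodeAsLong(9), encodeAsLong(10)) < 0);
    // Binary values can be summed after a plain decode, with no text parsing.
    System.out.println(decodeLong(encodeAsLong(9)) + decodeLong(encodeAsLong(10)));
  }
}
```

The ordering property shown here holds only for non-negative values. Whether ordering, aggregation, or storage size was the deciding factor for this issue is not stated in the messages above.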
[jira] [Commented] (YARN-4053) Change the way metric values are stored in HBase Storage
[ https://issues.apache.org/jira/browse/YARN-4053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15018447#comment-15018447 ]

Varun Saxena commented on YARN-4053:
------------------------------------

Thanks [~sjlee0] for the review and commit. And [~jrottinghuis] and [~vrushalic] for the reviews.
[jira] [Commented] (YARN-4053) Change the way metric values are stored in HBase Storage
[ https://issues.apache.org/jira/browse/YARN-4053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15015332#comment-15015332 ]

Hadoop QA commented on YARN-4053:
---------------------------------

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 12s | docker + precommit patch detected. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 2 new or modified test files. |
| +1 | mvninstall | 17m 44s | feature-YARN-2928 passed |
| +1 | compile | 0m 37s | feature-YARN-2928 passed with JDK v1.8.0_66 |
| +1 | compile | 0m 25s | feature-YARN-2928 passed with JDK v1.7.0_85 |
| +1 | checkstyle | 0m 16s | feature-YARN-2928 passed |
| +1 | mvnsite | 0m 37s | feature-YARN-2928 passed |
| +1 | mvneclipse | 0m 24s | feature-YARN-2928 passed |
| +1 | findbugs | 0m 54s | feature-YARN-2928 passed |
| -1 | javadoc | 0m 23s | hadoop-yarn-server-timelineservice in feature-YARN-2928 failed with JDK v1.8.0_66. |
| +1 | javadoc | 0m 22s | feature-YARN-2928 passed with JDK v1.7.0_85 |
| +1 | mvninstall | 0m 28s | the patch passed |
| +1 | compile | 0m 23s | the patch passed with JDK v1.8.0_66 |
| +1 | javac | 0m 23s | the patch passed |
| +1 | compile | 0m 24s | the patch passed with JDK v1.7.0_85 |
| +1 | javac | 0m 24s | the patch passed |
| +1 | checkstyle | 0m 13s | the patch passed |
| +1 | mvnsite | 0m 31s | the patch passed |
| +1 | mvneclipse | 0m 20s | the patch passed |
| +1 | whitespace | 0m 0s | Patch has no whitespace issues. |
| +1 | findbugs | 0m 51s | the patch passed |
| -1 | javadoc | 0m 20s | hadoop-yarn-server-timelineservice in the patch failed with JDK v1.8.0_66. |
| +1 | javadoc | 0m 22s | the patch passed with JDK v1.7.0_85 |
| +1 | unit | 3m 46s | hadoop-yarn-server-timelineservice in the patch passed with JDK v1.8.0_66. |
| +1 | unit | 3m 46s | hadoop-yarn-server-timelineservice in the patch passed with JDK v1.7.0_85. |
| +1 | asflicense | 0m 29s | Patch does not generate ASF License warnings. |
| | | 35m 3s | |

|| Subsystem || Report/Notes ||
| Docker | Image:yetus/hadoop:date2015-11-20 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12773453/YARN-4053-feature-YARN-2928.08.patch |
| JIRA Issue | YARN-4053 |
| Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle |
| uname | Linux 82b95b6d1a2e 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86
[jira] [Commented] (YARN-4053) Change the way metric values are stored in HBase Storage
[ https://issues.apache.org/jira/browse/YARN-4053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15014822#comment-15014822 ]

Sangjin Lee commented on YARN-4053:
-----------------------------------

Could you address that new checkstyle violation? Then I think it's good to go. Thanks!
[jira] [Commented] (YARN-4053) Change the way metric values are stored in HBase Storage
[ https://issues.apache.org/jira/browse/YARN-4053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15014815#comment-15014815 ]

Hadoop QA commented on YARN-4053:
---------------------------------

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 8s | docker + precommit patch detected. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 2 new or modified test files. |
| +1 | mvninstall | 15m 27s | feature-YARN-2928 passed |
| +1 | compile | 0m 17s | feature-YARN-2928 passed with JDK v1.8.0_66 |
| +1 | compile | 0m 19s | feature-YARN-2928 passed with JDK v1.7.0_85 |
| +1 | checkstyle | 0m 13s | feature-YARN-2928 passed |
| +1 | mvnsite | 0m 28s | feature-YARN-2928 passed |
| +1 | mvneclipse | 0m 23s | feature-YARN-2928 passed |
| +1 | findbugs | 0m 40s | feature-YARN-2928 passed |
| -1 | javadoc | 0m 16s | hadoop-yarn-server-timelineservice in feature-YARN-2928 failed with JDK v1.8.0_66. |
| +1 | javadoc | 0m 18s | feature-YARN-2928 passed with JDK v1.7.0_85 |
| +1 | mvninstall | 0m 22s | the patch passed |
| +1 | compile | 0m 17s | the patch passed with JDK v1.8.0_66 |
| +1 | javac | 0m 17s | the patch passed |
| +1 | compile | 0m 19s | the patch passed with JDK v1.7.0_85 |
| +1 | javac | 0m 19s | the patch passed |
| -1 | checkstyle | 0m 11s | Patch generated 1 new checkstyle issues in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice (total was 42, now 38). |
| +1 | mvnsite | 0m 33s | the patch passed |
| +1 | mvneclipse | 0m 27s | the patch passed |
| +1 | whitespace | 0m 0s | Patch has no whitespace issues. |
| +1 | findbugs | 0m 42s | the patch passed |
| -1 | javadoc | 0m 15s | hadoop-yarn-server-timelineservice in the patch failed with JDK v1.8.0_66. |
| +1 | javadoc | 0m 19s | the patch passed with JDK v1.7.0_85 |
| +1 | unit | 3m 5s | hadoop-yarn-server-timelineservice in the patch passed with JDK v1.8.0_66. |
| +1 | unit | 2m 55s | hadoop-yarn-server-timelineservice in the patch passed with JDK v1.7.0_85. |
| +1 | asflicense | 0m 25s | Patch does not generate ASF License warnings. |
| | | 29m 17s | |

|| Subsystem || Report/Notes ||
| Docker | Image:yetus/hadoop:date2015-11-19 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12773366/YARN-4053-feature-YARN-2928.07.patch |
| JIRA Issue | YARN-4053 |
| Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit
[jira] [Commented] (YARN-4053) Change the way metric values are stored in HBase Storage
[ https://issues.apache.org/jira/browse/YARN-4053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15014673#comment-15014673 ]

Varun Saxena commented on YARN-4053:
------------------------------------

This patch is similar to version 5 of the patch. It no longer uses generics, and ValueConverterImpl has been removed. The add method is now {{Number add(Number num1, Number num2, Number... numbers)}}.
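To make the non-generic signature concrete, here is a minimal sketch of how such an `add` could behave for long values. Only the method signature comes from the comment above; the class name and the choice to sum via `longValue()` are assumptions for illustration and may differ from the committed LongConverter.

```java
// Hypothetical sketch: only the add(...) signature is taken from the comment above.
final class LongAddSketch {

  // Requires at least two operands; any further operands arrive through the varargs tail.
  static Number add(Number num1, Number num2, Number... numbers) {
    long sum = num1.longValue() + num2.longValue();
    for (Number n : numbers) {
      sum += n.longValue();
    }
    return sum; // the primitive long is autoboxed to Long, which is a Number
  }

  public static void main(String[] args) {
    System.out.println(add(1L, 2L));           // two operands
    System.out.println(add(1L, 2L, 3L, 4L));   // extra operands via varargs
  }
}
```

Taking two fixed parameters before the varargs makes a call with fewer than two operands a compile-time error rather than a runtime check.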
[jira] [Commented] (YARN-4053) Change the way metric values are stored in HBase Storage
[ https://issues.apache.org/jira/browse/YARN-4053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15014525#comment-15014525 ]

Varun Saxena commented on YARN-4053:
------------------------------------

Ok.
[jira] [Commented] (YARN-4053) Change the way metric values are stored in HBase Storage
[ https://issues.apache.org/jira/browse/YARN-4053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15014518#comment-15014518 ]

Varun Saxena commented on YARN-4053:
------------------------------------

Just to clarify, by mixing I meant I was thinking of a method like {{T add(T num1, T num2, T... numbers)}}, as a minimum of 2 numbers is required to add. Anyway, if we change the method to the original suggestion, i.e. {{T add(Number... numbers)}}, it would work. But then we are not really utilizing generics here, at least for the add method.
Let me know your thoughts on this.
[jira] [Commented] (YARN-4053) Change the way metric values are stored in HBase Storage
[ https://issues.apache.org/jira/browse/YARN-4053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15014509#comment-15014509 ]

Sangjin Lee commented on YARN-4053:
-----------------------------------

Also, please look at the checkstyle violations. There are new ones that are introduced by this patch.
[jira] [Commented] (YARN-4053) Change the way metric values are stored in HBase Storage
[ https://issues.apache.org/jira/browse/YARN-4053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15014497#comment-15014497 ]

Sangjin Lee commented on YARN-4053:
-----------------------------------

As mentioned above, the generic version is good only if {{add()}} and {{compare()}} operate only on return values of {{decodeValue()}}. I'm not sure that's always a reliable assumption, although it happens to be true right now. [~jrottinghuis]?

Can we go back to the non-generic version for the ValueConverter hierarchy? Sorry for going back and forth on this one!

Just one more minor nit. In {{LongConverter}}, {{decodeValue()}} can simply return {{Bytes.toLong()}} without using {{Long.valueOf()}}. The same goes for {{add()}}. Autoboxing will take care of that for you.
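The autoboxing nit can be illustrated with a small sketch. It assumes a `decodeValue()` that returns `Object`, and it uses `java.nio.ByteBuffer` as a stand-in for HBase's `Bytes.toLong(byte[])` so the example needs no HBase dependency. The class and method names are hypothetical.

```java
import java.nio.ByteBuffer;

// Hypothetical sketch of the nit above: explicit Long.valueOf() versus plain autoboxing.
final class AutoboxingSketch {

  // Stand-in for HBase's Bytes.toLong(byte[]): reads an 8-byte big-endian long.
  static long toLong(byte[] bytes) {
    return ByteBuffer.wrap(bytes).getLong();
  }

  // Explicit boxing, as in the reviewed patch version.
  static Object decodeWithValueOf(byte[] bytes) {
    return Long.valueOf(toLong(bytes));
  }

  // Same result: the compiler boxes the primitive long to Long on return.
  static Object decodeWithAutoboxing(byte[] bytes) {
    return toLong(bytes);
  }

  public static void main(String[] args) {
    byte[] encoded = ByteBuffer.allocate(Long.BYTES).putLong(42L).array();
    System.out.println(decodeWithValueOf(encoded).equals(decodeWithAutoboxing(encoded)));
  }
}
```

Both methods return an equal `Long`, so the explicit call only adds noise.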
[jira] [Commented] (YARN-4053) Change the way metric values are stored in HBase Storage
[ https://issues.apache.org/jira/browse/YARN-4053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15014487#comment-15014487 ]

Hadoop QA commented on YARN-4053:
---------------------------------

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 9s | docker + precommit patch detected. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 2 new or modified test files. |
| +1 | mvninstall | 11m 12s | feature-YARN-2928 passed |
| +1 | compile | 0m 29s | feature-YARN-2928 passed with JDK v1.8.0_66 |
| +1 | compile | 0m 27s | feature-YARN-2928 passed with JDK v1.7.0_85 |
| +1 | checkstyle | 0m 15s | feature-YARN-2928 passed |
| +1 | mvnsite | 0m 38s | feature-YARN-2928 passed |
| +1 | mvneclipse | 0m 21s | feature-YARN-2928 passed |
| +1 | findbugs | 0m 47s | feature-YARN-2928 passed |
| -1 | javadoc | 0m 28s | hadoop-yarn-server-timelineservice in feature-YARN-2928 failed with JDK v1.8.0_66. |
| +1 | javadoc | 0m 26s | feature-YARN-2928 passed with JDK v1.7.0_85 |
| +1 | mvninstall | 0m 31s | the patch passed |
| +1 | compile | 0m 31s | the patch passed with JDK v1.8.0_66 |
| +1 | javac | 0m 31s | the patch passed |
| +1 | compile | 0m 25s | the patch passed with JDK v1.7.0_85 |
| +1 | javac | 0m 25s | the patch passed |
| -1 | checkstyle | 0m 15s | Patch generated 6 new checkstyle issues in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice (total was 43, now 45). |
| +1 | mvnsite | 0m 35s | the patch passed |
| +1 | mvneclipse | 0m 22s | the patch passed |
| +1 | whitespace | 0m 0s | Patch has no whitespace issues. |
| +1 | findbugs | 0m 56s | the patch passed |
| -1 | javadoc | 0m 27s | hadoop-yarn-server-timelineservice in the patch failed with JDK v1.8.0_66. |
| +1 | javadoc | 0m 24s | the patch passed with JDK v1.7.0_85 |
| +1 | unit | 4m 25s | hadoop-yarn-server-timelineservice in the patch passed with JDK v1.8.0_66. |
| +1 | unit | 3m 56s | hadoop-yarn-server-timelineservice in the patch passed with JDK v1.7.0_85. |
| +1 | asflicense | 0m 29s | Patch does not generate ASF License warnings. |
| | | 29m 56s | |

|| Subsystem || Report/Notes ||
| Docker | Image:yetus/hadoop:date2015-11-19 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12773338/YARN-4053-feature-YARN-2928.06.patch |
| JIRA Issue | YARN-4053 |
| Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit
[jira] [Commented] (YARN-4053) Change the way metric values are stored in HBase Storage
[ https://issues.apache.org/jira/browse/YARN-4053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15014432#comment-15014432 ]

Varun Saxena commented on YARN-4053:
------------------------------------

This patch removes ValueConverterImpl and uses generics as suggested by [~sjlee0].
I haven't added varargs to the add method, as varargs don't go well with generics and can lead to class cast exceptions. Maybe concrete class types can be used as varargs, but that would mean mixing generic params with a concrete type param (Number...) in the same method. So I haven't added them.
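The comment above does not say which failure mode it has in mind. One well-known way generic varargs can end in a ClassCastException is heap pollution through an erased array type, sketched below. This is a general Java illustration under that assumption, with hypothetical names, and is not code from the patch.

```java
// Hypothetical illustration of heap pollution with generic varargs; not code from the patch.
final class GenericVarargsSketch {

  // Exposes the varargs array. Its runtime type is chosen at the call site.
  @SuppressWarnings("unchecked")
  static <T> T[] toArray(T... args) {
    return args;
  }

  // Here T is erased, so the compiler creates an Object[] for the varargs call.
  @SuppressWarnings("unchecked")
  static <T> T[] pair(T a, T b) {
    return toArray(a, b);
  }

  // Returns true if assigning the result to String[] fails at runtime.
  static boolean failsWithClassCast() {
    try {
      String[] strings = pair("a", "b"); // compiler-inserted cast: Object[] to String[]
      return strings.length < 0;         // not reached when the cast fails
    } catch (ClassCastException e) {
      return true;
    }
  }

  public static void main(String[] args) {
    System.out.println(failsWithClassCast());
  }
}
```

A concrete parameter type such as `Number...` avoids this, because the array's runtime type is known at every call site.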
[jira] [Commented] (YARN-4053) Change the way metric values are stored in HBase Storage
[ https://issues.apache.org/jira/browse/YARN-4053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15012620#comment-15012620 ] Sangjin Lee commented on YARN-4053:

Yes, I spoke with [~jrottinghuis] regarding the stronger type safety suggestion. I'm now more or less comfortable with the current version. The additional type safety you get is probably minimal, and it can cause a bit more headache as a side effect, as Joep mentioned.

Just for the record, I tinkered with this a little bit, and there seems to be a way to do this safely. The assumption is that {{add()}} and {{compare()}} are invoked only on the return values of {{decodeValue()}}. That is currently the case, but it might not always be true. In any case, if that were always true, we could do
{code}
public interface ValueConverter<T> {
  byte[] encodeValue(Object value) throws IOException; // this value is still object
  T decodeValue(byte[] bytes) throws IOException;
}

public interface NumericValueConverter<T extends Number> extends ValueConverter<T>, Comparator<T> {
  T add(T num1, T num2);
}

public class LongConverter implements NumericValueConverter<Long> {
  ...
}
{code}
This is basically closing the loop between {{decodeValue()}} and {{add()}} and {{compare()}}. Again, this is a mild suggestion at this point, and it works only if the assumption that {{add()}} and {{compare()}} work only on the result of {{decodeValue()}} is true.
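To make the generic variant from the comment above concrete, here is a minimal sketch of how the three types could fit together. It is not the committed patch: the method bodies, the `java.nio.ByteBuffer` encoding, and the strict `Long`-only check in `encodeValue` are assumptions made only to keep the example self-contained.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.util.Comparator;

// Hypothetical sketch of the generic converter hierarchy from the comment.
interface ValueConverter<T> {
  byte[] encodeValue(Object value) throws IOException; // input is still an Object
  T decodeValue(byte[] bytes) throws IOException;
}

// add() and compare() only ever see values of the decoded type T.
interface NumericValueConverter<T extends Number>
    extends ValueConverter<T>, Comparator<T> {
  T add(T num1, T num2);
}

final class LongConverter implements NumericValueConverter<Long> {
  @Override
  public byte[] encodeValue(Object value) throws IOException {
    // Assumption for this sketch: only Long input is accepted.
    if (!(value instanceof Long)) {
      throw new IOException("Expected a Long but got: " + value);
    }
    return ByteBuffer.allocate(Long.BYTES).putLong((Long) value).array();
  }

  @Override
  public Long decodeValue(byte[] bytes) throws IOException {
    if (bytes == null || bytes.length != Long.BYTES) {
      throw new IOException("Expected exactly " + Long.BYTES + " bytes");
    }
    return ByteBuffer.wrap(bytes).getLong();
  }

  @Override
  public Long add(Long num1, Long num2) {
    return num1 + num2;
  }

  @Override
  public int compare(Long num1, Long num2) {
    return Long.compare(num1, num2);
  }
}
```

With this shape the compiler rejects `add()` calls on anything other than `Long` for a `LongConverter`, which is the extra safety being weighed in the thread; the cost is that callers holding a plain `Number` can no longer pass it through without a cast.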
[jira] [Commented] (YARN-4053) Change the way metric values are stored in HBase Storage
[ https://issues.apache.org/jira/browse/YARN-4053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15012595#comment-15012595 ] Joep Rottinghuis commented on YARN-4053:

[~sjlee0] the ValueConverterImpl is a remnant from an earlier implementation where the enum had anonymous inner classes in the enum values. That is no longer needed, and we can probably eliminate it.

If we go with the tightened type interface then we would be able to add only Longs to Longs etc. That by itself is probably not too bad, because the column would be defined as one or the other. However, in FlowScanner.processSummation this will be an issue. In that method we don't know what the type for each column is, so the sum member is defined as a Number to future-proof potential other conversions. This is why we made the conversion pluggable. The add method must accept two numbers and emit a number. We could still have the class accept any T extends Number and, if needed, use that T later in the class to ensure that the number is an instance of T (or else throw an exception) if we want to implement the restriction that Vrushali mentions. The add method would still accept Number in the signature, I think.

This did make me think of another thing: right now we add only two numbers, never more than two. If at some point we want to add three or more Numbers we would have to do that one at a time. We could change the signature of NumericConverter from
{code}
public Number add(Number num1, Number num2);
{code}
to
{code}
public Number add(Number... numbers);
{code}
The calling code wouldn't have to change because it passes two numbers, but a possible future use case could then add three or more arguments together. The implementation would change slightly to check for null and return null, or iterate over the list and return the sum of all values (that aren't null). That is probably more a matter of which style you like better, because right now we don't have this use case.
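The varargs variant described above could look roughly like the following. This is only a sketch: the class name `VarargsAddSketch` is hypothetical, and summing via `longValue()` assumes a long-based converter.

```java
// Hypothetical sketch of a varargs add() as discussed in the comment above.
final class VarargsAddSketch {
  private VarargsAddSketch() {
  }

  // Sums all non-null arguments as longs. Returns null when the argument
  // array itself is null, is empty, or contains only null entries.
  static Number add(Number... numbers) {
    if (numbers == null) {
      return null;
    }
    Long sum = null;
    for (Number n : numbers) {
      if (n == null) {
        continue; // null entries are skipped rather than treated as errors
      }
      sum = (sum == null) ? n.longValue() : sum + n.longValue();
    }
    return sum;
  }
}
```

Existing two-argument call sites such as `add(a, b)` compile unchanged against this signature, which is the compatibility point made in the comment.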
[jira] [Commented] (YARN-4053) Change the way metric values are stored in HBase Storage
[ https://issues.apache.org/jira/browse/YARN-4053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15012508#comment-15012508 ] Sangjin Lee commented on YARN-4053:

Sorry, I should have double checked. It would be more like
{code}
public interface NumberValueConverter<T extends Number> extends ValueConverter, Comparator<T> {
  T add(T num1, T num2);
}
{code}
[jira] [Commented] (YARN-4053) Change the way metric values are stored in HBase Storage
[ https://issues.apache.org/jira/browse/YARN-4053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15012422#comment-15012422 ] Vrushali C commented on YARN-4053:

Alright, [~sjlee0] explained to me that there are writes coming via REST, so it is okay to have the isIntegralValue check in there. Thanks [~sjlee0]!
[jira] [Commented] (YARN-4053) Change the way metric values are stored in HBase Storage
[ https://issues.apache.org/jira/browse/YARN-4053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15012313#comment-15012313 ] Vrushali C commented on YARN-4053:

I thought that was only for client-side read queries. Is the REST API layer being invoked for writes as well? So each container makes a REST call to the NM when it has to write a metric? I am wondering whether that will be very slow.
[jira] [Commented] (YARN-4053) Change the way metric values are stored in HBase Storage
[ https://issues.apache.org/jira/browse/YARN-4053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15012244#comment-15012244 ] Sangjin Lee commented on YARN-4053:

Is it the same question as what Varun answered earlier?

bq. JAX-RS, i.e. the REST API layer, will convert an integral value to Integer automatically if it's less than Integer.MAX_VALUE, so I guess we will have to handle ints and shorts as well, i.e. if it's an Integer for instance, we can call Integer#longValue to convert it to long.
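The widening described in the quoted answer can be sketched as follows. The thread mentions an isIntegralValue check, but the helper below is a stand-in written for this example, not the project's code; the class name and the `java.nio.ByteBuffer` encoding are assumptions chosen to keep the sketch free of dependencies.

```java
import java.nio.ByteBuffer;

// Hypothetical sketch: accept any integral boxed type and store it as a long.
final class IntegralEncodingSketch {
  private IntegralEncodingSketch() {
  }

  // True for the boxed integral types a REST layer might hand over.
  static boolean isIntegralValue(Object value) {
    return value instanceof Short
        || value instanceof Integer
        || value instanceof Long;
  }

  // Widens the value to long and encodes it as 8 big-endian bytes.
  static byte[] encodeAsLong(Object value) {
    if (!isIntegralValue(value)) {
      throw new IllegalArgumentException("Not an integral value: " + value);
    }
    long widened = ((Number) value).longValue();
    return ByteBuffer.allocate(Long.BYTES).putLong(widened).array();
  }

  static long decodeLong(byte[] bytes) {
    return ByteBuffer.wrap(bytes).getLong();
  }
}
```

Because every accepted type is stored in the same 8-byte form, a reader never needs to know whether the writer originally supplied an `Integer` or a `Long`.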
[jira] [Commented] (YARN-4053) Change the way metric values are stored in HBase Storage
[ https://issues.apache.org/jira/browse/YARN-4053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15012208#comment-15012208 ] Vrushali C commented on YARN-4053:

Thanks [~varun_saxena] for the patch! It looks good to me except for the following question. One point I recollect was that we wanted to accept only longs while encoding. In patch v4 I see that LongConverter#encodeValue() accepts any integral value and encodes it as a Long. I am wondering if we should accept only longs, that is, modify the isIntegralValue check to isLongValue? cc [~jrottinghuis]
[jira] [Commented] (YARN-4053) Change the way metric values are stored in HBase Storage
[ https://issues.apache.org/jira/browse/YARN-4053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15012185#comment-15012185 ] Sangjin Lee commented on YARN-4053:

Sorry it took me a while to review the latest patch. It looks good for the most part. I have a couple of minor comments.
- The name {{ValueConverterImpl}} threw me off a little because initially I thought it was about actual implementations of {{ValueConverter}}. It seems more like a {{ValueConverterFactory}}. I think we can go straight to {{GenericConverter.getInstance()}} and {{LongConverter.getInstance()}}. The factory seems like a little overkill.
- Can we change {{NumericValueConverter}} to use generics; e.g.
{code}
public interface NumericValueConverter<T> extends ValueConverter, Comparator<T> {
{code}
? This can be even tighter than the current interface.
[jira] [Commented] (YARN-4053) Change the way metric values are stored in HBase Storage
[ https://issues.apache.org/jira/browse/YARN-4053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15011789#comment-15011789 ] Joep Rottinghuis commented on YARN-4053:

Thanks [~varun_saxena] patch looks good to me. Thanks for the updates! [~sjlee0] and [~vrushalic] said they'll look at this patch by end of day today as well.
[jira] [Commented] (YARN-4053) Change the way metric values are stored in HBase Storage
[ https://issues.apache.org/jira/browse/YARN-4053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15011590#comment-15011590 ] Hadoop QA commented on YARN-4053:

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 7s | docker + precommit patch detected. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 2 new or modified test files. |
| +1 | mvninstall | 15m 25s | feature-YARN-2928 passed |
| +1 | compile | 0m 21s | feature-YARN-2928 passed with JDK v1.8.0_66 |
| +1 | compile | 0m 22s | feature-YARN-2928 passed with JDK v1.7.0_85 |
| +1 | checkstyle | 0m 15s | feature-YARN-2928 passed |
| +1 | mvnsite | 0m 32s | feature-YARN-2928 passed |
| +1 | mvneclipse | 0m 24s | feature-YARN-2928 passed |
| +1 | findbugs | 0m 48s | feature-YARN-2928 passed |
| -1 | javadoc | 0m 20s | hadoop-yarn-server-timelineservice in feature-YARN-2928 failed with JDK v1.8.0_66. |
| +1 | javadoc | 0m 22s | feature-YARN-2928 passed with JDK v1.7.0_85 |
| +1 | mvninstall | 0m 24s | the patch passed |
| +1 | compile | 0m 20s | the patch passed with JDK v1.8.0_66 |
| +1 | javac | 0m 20s | the patch passed |
| +1 | compile | 0m 22s | the patch passed with JDK v1.7.0_85 |
| +1 | javac | 0m 22s | the patch passed |
| +1 | checkstyle | 0m 12s | the patch passed |
| +1 | mvnsite | 0m 30s | the patch passed |
| +1 | mvneclipse | 0m 18s | the patch passed |
| +1 | whitespace | 0m 0s | Patch has no whitespace issues. |
| +1 | findbugs | 0m 50s | the patch passed |
| -1 | javadoc | 0m 18s | hadoop-yarn-server-timelineservice in the patch failed with JDK v1.8.0_66. |
| +1 | javadoc | 0m 20s | the patch passed with JDK v1.7.0_85 |
| +1 | unit | 3m 23s | hadoop-yarn-server-timelineservice in the patch passed with JDK v1.8.0_66. |
| +1 | unit | 3m 26s | hadoop-yarn-server-timelineservice in the patch passed with JDK v1.7.0_85. |
| +1 | asflicense | 0m 31s | Patch does not generate ASF License warnings. |
| | | 31m 5s | |

|| Subsystem || Report/Notes ||
| Docker | Image:yetus/hadoop:date2015-11-18 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12773033/YARN-4053-feature-YARN-2928.05.patch |
| JIRA Issue | YARN-4053 |
| Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle |
| uname | Linux 59b383bea07a 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_
[jira] [Commented] (YARN-4053) Change the way metric values are stored in HBase Storage
[ https://issues.apache.org/jira/browse/YARN-4053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15009162#comment-15009162 ] Joep Rottinghuis commented on YARN-4053:

Either throw exception, or document in javadoc that null is considered 0. For addition that would be reasonable, but either approach is fine (as long as the javadoc explains which approach was taken). For comparison I suppose only null == null is reasonable. For null compared to anything else you can either assume 0 or throw an exception. Not sure that one can say that null is bigger or smaller than any other value.

Wrt comparable versus comparator you're correct. We're not comparing the ValueConverters themselves, but rather Numbers.
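One way to pin down the null policy discussed above is sketched below. It picks one option per method purely for illustration (null counts as 0 in `add`, while a one-sided null in `compare` throws); the thread leaves the choice open, and the class name `NullPolicySketch` is hypothetical.

```java
import java.util.Comparator;

// Hypothetical sketch of one possible null policy for add() and compare().
final class NullPolicySketch implements Comparator<Number> {

  /**
   * Adds two numbers as longs. A null operand is treated as 0, so
   * add(null, null) returns 0.
   */
  static long add(Number num1, Number num2) {
    long a = (num1 == null) ? 0L : num1.longValue();
    long b = (num2 == null) ? 0L : num2.longValue();
    return a + b;
  }

  /**
   * Compares two numbers as longs. Two nulls are considered equal; a null
   * compared with a non-null value is rejected, because null is neither
   * larger nor smaller than any number.
   */
  @Override
  public int compare(Number num1, Number num2) {
    if (num1 == null && num2 == null) {
      return 0;
    }
    if (num1 == null || num2 == null) {
      throw new NullPointerException("Cannot compare null with a number");
    }
    return Long.compare(num1.longValue(), num2.longValue());
  }
}
```

Whichever policy is chosen, stating it in the javadoc as shown keeps callers from having to guess.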
[jira] [Commented] (YARN-4053) Change the way metric values are stored in HBase Storage
[ https://issues.apache.org/jira/browse/YARN-4053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15009115#comment-15009115 ] Varun Saxena commented on YARN-4053:

bq. public interface NumericValueConverter extends ValueConverter, Comparable ?
I guess you mean Comparator. Yes, we can have that.

bq. LongConverter.compare and LongConverter.add should probably handle null values.
Throw an exception if either is null?
[jira] [Commented] (YARN-4053) Change the way metric values are stored in HBase Storage
[ https://issues.apache.org/jira/browse/YARN-4053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15008041#comment-15008041 ] Joep Rottinghuis commented on YARN-4053:

Looks good [~varun_saxena]. This is a nice separation of the conversion from the numeric conversion / comparison and general numeric manipulation.

Nit: LongConverter.compare and LongConverter.add should probably handle null values.

Question: any reason you don't simply have public interface NumericValueConverter extends ValueConverter, Comparable ?
[jira] [Commented] (YARN-4053) Change the way metric values are stored in HBase Storage
[ https://issues.apache.org/jira/browse/YARN-4053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15007358#comment-15007358 ] Hadoop QA commented on YARN-4053:

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 6s | docker + precommit patch detected. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 2 new or modified test files. |
| +1 | mvninstall | 9m 8s | feature-YARN-2928 passed |
| +1 | compile | 0m 15s | feature-YARN-2928 passed with JDK v1.8.0_60 |
| +1 | compile | 0m 16s | feature-YARN-2928 passed with JDK v1.7.0_79 |
| +1 | checkstyle | 0m 12s | feature-YARN-2928 passed |
| +1 | mvnsite | 0m 25s | feature-YARN-2928 passed |
| +1 | mvneclipse | 0m 21s | feature-YARN-2928 passed |
| +1 | findbugs | 0m 38s | feature-YARN-2928 passed |
| -1 | javadoc | 0m 17s | hadoop-yarn-server-timelineservice in feature-YARN-2928 failed with JDK v1.8.0_60. |
| +1 | javadoc | 0m 17s | feature-YARN-2928 passed with JDK v1.7.0_79 |
| +1 | mvninstall | 0m 21s | the patch passed |
| +1 | compile | 0m 14s | the patch passed with JDK v1.8.0_60 |
| +1 | javac | 0m 14s | the patch passed |
| +1 | compile | 0m 16s | the patch passed with JDK v1.7.0_79 |
| +1 | javac | 0m 16s | the patch passed |
| +1 | checkstyle | 0m 11s | the patch passed |
| +1 | mvnsite | 0m 23s | the patch passed |
| +1 | mvneclipse | 0m 15s | the patch passed |
| +1 | whitespace | 0m 0s | Patch has no whitespace issues. |
| +1 | findbugs | 0m 41s | the patch passed |
| -1 | javadoc | 0m 15s | hadoop-yarn-server-timelineservice in the patch failed with JDK v1.8.0_60. |
| +1 | javadoc | 0m 16s | the patch passed with JDK v1.7.0_79 |
| +1 | unit | 2m 35s | hadoop-yarn-server-timelineservice in the patch passed with JDK v1.8.0_60. |
| +1 | unit | 2m 48s | hadoop-yarn-server-timelineservice in the patch passed with JDK v1.7.0_79. |
| +1 | asflicense | 0m 24s | Patch does not generate ASF License warnings. |
| | | 21m 36s | |

|| Subsystem || Report/Notes ||
| Docker | Client=1.7.1 Server=1.7.1 Image:test-patch-base-hadoop-date2015-11-16 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12772577/YARN-4053-feature-YARN-2928.04.patch |
| JIRA Issue | YARN-4053 |
| Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle |
| uname | Linux 2f046c7ddf23 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMP
[jira] [Commented] (YARN-4053) Change the way metric values are stored in HBase Storage
[ https://issues.apache.org/jira/browse/YARN-4053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15004887#comment-15004887 ] Joep Rottinghuis commented on YARN-4053:

In further discussion with [~sjlee0] it may actually be that the current implementation of the FlowScanner doesn't properly deal with mixed numeric and non-numeric columns. Perhaps that is a separate jira to properly deal with that. I think what we may have to do is to ensure that we normally return any column that is non-numeric untouched.

Sangjin suggested that perhaps the new methods I mentioned (comparator, and sum) make sense only for numeric types. That can be more cleanly implemented as a sub-interface. Then the FlowScanner can determine if the returned converter is numeric. If so, it can process as is with collapsing values, or else it would simply leave the cells untouched. That way we could mix numeric and non-numeric columns in one column family and we avoid having to implement a sum or a meaningless comparison between unrelated objects.
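The sub-interface idea above can be sketched as follows: a scanner-like routine collapses cells only when the column's converter is numeric and otherwise passes them through. All names here (`Converter`, `NumericConverter`, `LongCodec`, `StringCodec`, `CollapseSketch`) are hypothetical stand-ins, not the FlowScanner code.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical base converter: decodes raw cell bytes.
interface Converter {
  Object decode(byte[] bytes);
}

// Hypothetical numeric sub-interface: only these converters can sum values.
interface NumericConverter extends Converter {
  @Override
  Number decode(byte[] bytes);

  Number add(Number num1, Number num2);
}

final class LongCodec implements NumericConverter {
  @Override
  public Number decode(byte[] bytes) {
    return ByteBuffer.wrap(bytes).getLong();
  }

  @Override
  public Number add(Number num1, Number num2) {
    return num1.longValue() + num2.longValue();
  }
}

final class StringCodec implements Converter {
  @Override
  public Object decode(byte[] bytes) {
    return new String(bytes, StandardCharsets.UTF_8);
  }
}

final class CollapseSketch {
  private CollapseSketch() {
  }

  // Numeric columns are collapsed into a single sum; cells of any other
  // column are returned untouched.
  static List<Object> process(Converter converter, List<byte[]> cells) {
    List<Object> out = new ArrayList<>();
    if (converter instanceof NumericConverter) {
      NumericConverter numeric = (NumericConverter) converter;
      Number sum = null;
      for (byte[] cell : cells) {
        Number value = numeric.decode(cell);
        sum = (sum == null) ? value : numeric.add(sum, value);
      }
      if (sum != null) {
        out.add(sum);
      }
    } else {
      out.addAll(cells);
    }
    return out;
  }
}
```

The `instanceof` check is what lets numeric and non-numeric columns share a column family: non-numeric converters never need to supply a sum or a comparison.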
[jira] [Commented] (YARN-4053) Change the way metric values are stored in HBase Storage
[ https://issues.apache.org/jira/browse/YARN-4053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15004842#comment-15004842 ]

Joep Rottinghuis commented on YARN-4053:

Now that I'm looking at it again in detail with [~vrushalic], the perf impact is probably not quite as bad. I thought for a moment that the iteration had to happen for each cell, but it is needed only for each column. I agree with your point that we should keep it generic and future-proof. If we have the converter, then we should just use it.

However, in FlowScanner.compareCellValues (line 357) you're assuming a long and just blindly casting to long. If the converter happens to be a GenericObjectMapper, that will fail. Perhaps we should do the following: make the converter extend comparable and also add a method to sum two (or more) byte values in a varargs style. For the long implementation that should simply be a regular sum; for the generic object mapper it may be a string append, or whatever makes sense. Then we're future-proof: another style of conversion, summing, and comparison can be added safely, right? Would that make sense?
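A minimal sketch of the proposal above, where the converter itself knows how to compare and sum encoded values so that the scanner never casts blindly. All names here (`SummableConverter`, `LongSummableConverter`, `StringSummableConverter`) are hypothetical, and the string variant merely stands in for whatever a generic converter would do.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Comparator;

// Hypothetical contract: compare two encoded values and sum two or more of them.
interface SummableConverter extends Comparator<byte[]> {
  byte[] sum(byte[] first, byte[]... rest);
}

final class LongSummableConverter implements SummableConverter {
  static byte[] encode(long value) {
    return ByteBuffer.allocate(Long.BYTES).putLong(value).array();
  }
  private static long decode(byte[] bytes) {
    return ByteBuffer.wrap(bytes).getLong();
  }
  @Override public int compare(byte[] a, byte[] b) {
    return Long.compare(decode(a), decode(b));
  }
  // For longs, "sum" is a regular arithmetic sum.
  @Override public byte[] sum(byte[] first, byte[]... rest) {
    long total = decode(first);
    for (byte[] value : rest) {
      total += decode(value);
    }
    return encode(total);
  }
}

final class StringSummableConverter implements SummableConverter {
  static byte[] encode(String value) {
    return value.getBytes(StandardCharsets.UTF_8);
  }
  private static String decode(byte[] bytes) {
    return new String(bytes, StandardCharsets.UTF_8);
  }
  @Override public int compare(byte[] a, byte[] b) {
    return decode(a).compareTo(decode(b));
  }
  // For strings, "sum" is taken to mean an append, as the comment suggests.
  @Override public byte[] sum(byte[] first, byte[]... rest) {
    StringBuilder joined = new StringBuilder(decode(first));
    for (byte[] value : rest) {
      joined.append(decode(value));
    }
    return encode(joined.toString());
  }
}
```

A caller would then write `converter.compare(a, b)` or `converter.sum(a, b, c)` without knowing which encoding is behind the bytes.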
[jira] [Commented] (YARN-4053) Change the way metric values are stored in HBase Storage
[ https://issues.apache.org/jira/browse/YARN-4053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15004727#comment-15004727 ]

Varun Saxena commented on YARN-4053:

bq. Would it be safe to always assume a long converter?

I think it would be safe to assume that while processing, given the current columns and converters. The issue arises if somebody changes the converter for a column/column prefix in the future and forgets to change it here. Arguably, though, it's difficult to miss such a change after testing. The reason for iterating over all columns/column prefixes was to handle future column additions or a change of converter for existing columns.

Getting the converter on every iteration was a concern for me as well. That's why I do it only when the column qualifier changes while iterating over cells. I'm not sure whether we can get hold of all possible column qualifiers (and hence the converters) for this scan in the constructor. Another option would be to use cell tags to identify converters.
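The "look up the converter only when the column qualifier changes" idea can be illustrated with a small sketch. It assumes that cells arrive grouped by qualifier, as they do when HBase returns the cells of a row in sorted order. The class name, the `m!` prefix, and the use of plain strings as stand-ins for converters are all hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class ConverterLookup {
  private final Map<String, String> converterByPrefix;
  private String lastQualifier;
  private String lastConverter;
  private int fullLookups;

  ConverterLookup(Map<String, String> converterByPrefix) {
    this.converterByPrefix = new LinkedHashMap<>(converterByPrefix);
  }

  // Re-scans the known prefixes only when the qualifier differs from the previous cell's.
  String converterFor(String qualifier) {
    if (!qualifier.equals(lastQualifier)) {
      lastQualifier = qualifier;
      lastConverter = scanPrefixes(qualifier);
    }
    return lastConverter;
  }

  // Number of full prefix scans performed so far.
  int fullLookups() {
    return fullLookups;
  }

  private String scanPrefixes(String qualifier) {
    fullLookups++;
    for (Map.Entry<String, String> entry : converterByPrefix.entrySet()) {
      if (qualifier.startsWith(entry.getKey())) {
        return entry.getValue();
      }
    }
    return "GENERIC"; // assumed default when no prefix matches
  }
}
```

For a row whose cells carry the qualifiers `i!name, m!cpu, m!cpu, m!cpu`, this performs two full scans rather than four; the saving grows with the number of versions stored per column.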
[jira] [Commented] (YARN-4053) Change the way metric values are stored in HBase Storage
[ https://issues.apache.org/jira/browse/YARN-4053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15004550#comment-15004550 ]

Joep Rottinghuis commented on YARN-4053:

Patch looks good. The converter approach looks nice and clean. One thing I'm wondering is whether we need to look up the converter each time: FlowScanner line 180 calls getValueConverter on every pass through nextInternal. I'm wondering if we can somehow avoid this. Would it be safe to always assume a long converter? What happens if we did end up mixing a long and a non-long converter? Perhaps that is not a safe assumption. Would it be possible to do this once and not each time? I'll chat with [~vrushalic] about whether we can somehow get around iterating over every column for each cell. Perhaps we can capture this during the constructor.
[jira] [Commented] (YARN-4053) Change the way metric values are stored in HBase Storage
[ https://issues.apache.org/jira/browse/YARN-4053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15003690#comment-15003690 ]

Varun Saxena commented on YARN-4053:

Updating a renamed patch file so that QA build can run.
[jira] [Commented] (YARN-4053) Change the way metric values are stored in HBase Storage
[ https://issues.apache.org/jira/browse/YARN-4053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15003688#comment-15003688 ]

Hadoop QA commented on YARN-4053:

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 12s | docker + precommit patch detected. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 2 new or modified test files. |
| +1 | mvninstall | 10m 10s | feature-YARN-2928 passed |
| +1 | compile | 0m 18s | feature-YARN-2928 passed with JDK v1.8.0_66 |
| +1 | compile | 0m 19s | feature-YARN-2928 passed with JDK v1.7.0_79 |
| +1 | checkstyle | 0m 15s | feature-YARN-2928 passed |
| +1 | mvnsite | 0m 29s | feature-YARN-2928 passed |
| +1 | mvneclipse | 0m 25s | feature-YARN-2928 passed |
| +1 | findbugs | 0m 43s | feature-YARN-2928 passed |
| -1 | javadoc | 0m 21s | hadoop-yarn-server-timelineservice in feature-YARN-2928 failed with JDK v1.8.0_66. |
| +1 | javadoc | 0m 20s | feature-YARN-2928 passed with JDK v1.7.0_79 |
| +1 | mvninstall | 0m 22s | the patch passed |
| +1 | compile | 0m 18s | the patch passed with JDK v1.8.0_66 |
| +1 | javac | 0m 18s | the patch passed |
| +1 | compile | 0m 18s | the patch passed with JDK v1.7.0_79 |
| +1 | javac | 0m 18s | the patch passed |
| -1 | checkstyle | 0m 13s | Patch generated 4 new checkstyle issues in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice (total was 41, now 45). |
| +1 | mvnsite | 0m 26s | the patch passed |
| +1 | mvneclipse | 0m 18s | the patch passed |
| +1 | whitespace | 0m 0s | Patch has no whitespace issues. |
| +1 | findbugs | 0m 49s | the patch passed |
| -1 | javadoc | 0m 19s | hadoop-yarn-server-timelineservice in the patch failed with JDK v1.8.0_66. |
| +1 | javadoc | 0m 20s | the patch passed with JDK v1.7.0_79 |
| +1 | unit | 3m 43s | hadoop-yarn-server-timelineservice in the patch passed with JDK v1.8.0_66. |
| +1 | unit | 3m 41s | hadoop-yarn-server-timelineservice in the patch passed with JDK v1.7.0_79. |
| +1 | asflicense | 0m 27s | Patch does not generate ASF License warnings. |
| | | 25m 53s | |

|| Subsystem || Report/Notes ||
| Docker | Client=1.7.0 Server=1.7.0 Image:test-patch-base-hadoop-date2015-11-13 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12772138/YARN-4053-feature-YARN-2928.03.patch |
| JIRA Issue | YARN-4053 |
| Optional Tests | asflicense compile javac
[jira] [Commented] (YARN-4053) Change the way metric values are stored in HBase Storage
[ https://issues.apache.org/jira/browse/YARN-4053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14999871#comment-14999871 ]

Hadoop QA commented on YARN-4053:

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 6s | docker + precommit patch detected. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 2 new or modified test files. |
| -1 | mvninstall | 5m 31s | root in YARN-2928 failed. |
| -1 | compile | 1m 31s | hadoop-yarn-server-timelineservice in YARN-2928 failed with JDK v1.8.0_60. |
| -1 | compile | 0m 12s | hadoop-yarn-server-timelineservice in YARN-2928 failed with JDK v1.7.0_79. |
| +1 | checkstyle | 0m 12s | YARN-2928 passed |
| +1 | mvneclipse | 0m 37s | YARN-2928 passed |
| -1 | findbugs | 0m 12s | hadoop-yarn-server-timelineservice in YARN-2928 failed. |
| -1 | javadoc | 0m 13s | hadoop-yarn-server-timelineservice in YARN-2928 failed with JDK v1.8.0_60. |
| +1 | javadoc | 0m 14s | YARN-2928 passed with JDK v1.7.0_79 |
| -1 | mvninstall | 0m 11s | hadoop-yarn-server-timelineservice in the patch failed. |
| -1 | compile | 0m 9s | hadoop-yarn-server-timelineservice in the patch failed with JDK v1.8.0_60. |
| -1 | javac | 0m 9s | hadoop-yarn-server-timelineservice in the patch failed with JDK v1.8.0_60. |
| -1 | compile | 0m 12s | hadoop-yarn-server-timelineservice in the patch failed with JDK v1.7.0_79. |
| -1 | javac | 0m 12s | hadoop-yarn-server-timelineservice in the patch failed with JDK v1.7.0_79. |
| -1 | checkstyle | 0m 10s | Patch generated 3 new checkstyle issues in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice (total was 41, now 44). |
| +1 | mvneclipse | 0m 12s | the patch passed |
| +1 | whitespace | 0m 1s | Patch has no whitespace issues. |
| -1 | findbugs | 0m 12s | hadoop-yarn-server-timelineservice in the patch failed. |
| -1 | javadoc | 0m 12s | hadoop-yarn-server-timelineservice in the patch failed with JDK v1.8.0_60. |
| +1 | javadoc | 0m 13s | the patch passed with JDK v1.7.0_79 |
| -1 | unit | 0m 9s | hadoop-yarn-server-timelineservice in the patch failed with JDK v1.8.0_60. |
| -1 | unit | 0m 11s | hadoop-yarn-server-timelineservice in the patch failed with JDK v1.7.0_79. |
| -1 | asflicense | 0m 16s | Patch generated 1 ASF License warnings. |
| | | 11m 37s | |

|| Subsystem || Report/Notes ||
| Docker | Client=1.7.1 Server=1.7.1 Image:test-patch-base-hadoop-date2015-11-11 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12771679/YARN-4053-YARN-2928.03.patch |
| JIRA Issue | YARN-4053 |
| Optional Tests | asflicense javac javadoc mvninstall unit findbugs checkstyle compile |
[jira] [Commented] (YARN-4053) Change the way metric values are stored in HBase Storage
[ https://issues.apache.org/jira/browse/YARN-4053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14999841#comment-14999841 ]

Varun Saxena commented on YARN-4053:

Moreover, the proposal is to not record whether to aggregate in the column qualifier, so there is nothing to do here for that. YARN-3816 will have to remove the corresponding code.
[jira] [Commented] (YARN-4053) Change the way metric values are stored in HBase Storage
[ https://issues.apache.org/jira/browse/YARN-4053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14999837#comment-14999837 ]

Varun Saxena commented on YARN-4053:

Attached a new patch addressing the points above. Added a ValueConverter interface and a ValueConverterImpl enum which contains GENERIC and LONG implementations. In FlowScanner, we will have to iterate over all the available column prefixes and columns to get hold of the right converter.

I haven't addressed the TIME_SERIES-related point yet; that can go in the next patch once consensus is reached on the implementation.

Functionally, compared with the last patch, I am now also storing min start time and max end time as longs.
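For readers without the patch at hand, an enum with GENERIC and LONG constants, plus a lookup that iterates over the column prefixes, might look roughly like the sketch below. The names `ValueConverter` and `ValueConverterImpl` come from the comment above; the method signatures, `ColumnPrefixSketch`, and the prefix strings are assumptions, and GENERIC uses a plain UTF-8 string as a stand-in for GenericObjectMapper.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

interface ValueConverter {
  byte[] encodeValue(Object value);
  Object decodeValue(byte[] bytes);
}

enum ValueConverterImpl implements ValueConverter {
  // Stand-in for the GenericObjectMapper path: everything becomes a UTF-8 string.
  GENERIC {
    @Override public byte[] encodeValue(Object value) {
      return String.valueOf(value).getBytes(StandardCharsets.UTF_8);
    }
    @Override public Object decodeValue(byte[] bytes) {
      return new String(bytes, StandardCharsets.UTF_8);
    }
  },
  // Strictly accepts Long and stores it as 8 big-endian bytes.
  LONG {
    @Override public byte[] encodeValue(Object value) {
      if (!(value instanceof Long)) {
        throw new IllegalArgumentException("expected a Long but got " + value);
      }
      return ByteBuffer.allocate(Long.BYTES).putLong((Long) value).array();
    }
    @Override public Object decodeValue(byte[] bytes) {
      return ByteBuffer.wrap(bytes).getLong();
    }
  }
}

// Hypothetical column prefixes: GENERIC is implied, LONG must be passed explicitly.
enum ColumnPrefixSketch {
  INFO("i"),
  METRIC("m", ValueConverterImpl.LONG);

  private final String prefix;
  private final ValueConverter converter;

  ColumnPrefixSketch(String prefix) {
    this(prefix, ValueConverterImpl.GENERIC);
  }

  ColumnPrefixSketch(String prefix, ValueConverter converter) {
    this.prefix = prefix;
    this.converter = converter;
  }

  // Iterates over all known prefixes to find the converter attached to one of them.
  static ValueConverter converterFor(String prefix) {
    for (ColumnPrefixSketch candidate : values()) {
      if (candidate.prefix.equals(prefix)) {
        return candidate.converter;
      }
    }
    throw new IllegalArgumentException("unknown column prefix: " + prefix);
  }
}
```

Attaching the converter to the column definition keeps the encoding decision in one place, which is what lets a scanner ask the column for its converter rather than assume a long.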
[jira] [Commented] (YARN-4053) Change the way metric values are stored in HBase Storage
[ https://issues.apache.org/jira/browse/YARN-4053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14999822#comment-14999822 ]

Vrushali C commented on YARN-4053:

bq. Vrushali, thanks for your comments. I would like to work on this. Let me take a stab on this one. Will have the bandwidth. I hope it's fine. You can help me with the reviews.

Sounds good. Let me go through the discussion points you have mentioned and get back to you on this.
[jira] [Commented] (YARN-4053) Change the way metric values are stored in HBase Storage
[ https://issues.apache.org/jira/browse/YARN-4053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14999780#comment-14999780 ]

Varun Saxena commented on YARN-4053:

Ok
[jira] [Commented] (YARN-4053) Change the way metric values are stored in HBase Storage
[ https://issues.apache.org/jira/browse/YARN-4053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14999660#comment-14999660 ]

Sangjin Lee commented on YARN-4053:

To make progress with this ticket, if you're in line with what Vrushali said above, we can focus on implementing correct long support in this ticket. We don't have to worry about the other dimensions (whether to aggregate, or single value vs. time series) here.
[jira] [Commented] (YARN-4053) Change the way metric values are stored in HBase Storage
[ https://issues.apache.org/jira/browse/YARN-4053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14997121#comment-14997121 ]

Varun Saxena commented on YARN-4053:

Vrushali, thanks for your comments. I would like to work on this. Let me take a stab at this one; I will have the bandwidth. I hope that's fine. You can help me with the reviews.

Coming to the points, I agree that a flag is not good for extensibility. As I said earlier, a flag should be fine for now, as we have only 2 choices (generic or long) and we can extend later. But eventually we will need different handlers for different types, so why not do it now? Hence, let's go with the proposal above.

Moreover, yes, we need proper handling based on data type or conversion mechanism in FlowScanner too. As mentioned in an earlier comment, I was thinking we could indicate this in attributes, but I guess your proposal sounds better. We can identify the column/column prefix in the flow scanner as well and convert based on the converter attached to it.

bq. it missed one of the places in the current patch for example

Which place? MIN/MAX handling?

bq. For single value vs time series, we suggest using a column prefix to distinguish them

Do we need to differentiate between SINGLE_VALUE and TIME_SERIES if by default a metric will be read as SINGLE_VALUE? After all, we will be storing multiple values even for a metric of type SINGLE_VALUE. Do you mean that on the read side only the latest value of a metric is to be returned if it's of type SINGLE_VALUE (even if the client asks for TIME_SERIES)? Again, the assumption here is that the client will always send the metric type (SINGLE_VALUE or TIME_SERIES) consistently.

bq. For the read path, we can assume it is a single value unless specifically specified by the client as a time series (as clients would need to intend to read time series explicitly).

We can return TIME_SERIES by indicating something like METRICS_TIME_SERIES in fields. If we do so, it will have implications for YARN-3862. Now the question is whether to return values for multiple timestamps even for a metric of type SINGLE_VALUE if the client asks for it. What if the client wants to see the values of a gauge (which might be considered a SINGLE_VALUE) over a period of time, for instance? If yes, do we even need to differentiate between the 2 types?

bq. We finally concluded that we should start with storing longs only and make the code strictly accept longs

JAX-RS, i.e. the REST API layer, will automatically convert an integral value to an Integer if it's less than Integer.MAX_VALUE, so I guess we will have to handle ints and shorts as well. For instance, if the value is an Integer, we can call Integer#longValue to convert it to a long.

bq. Regarding indicating whether to aggregate or not, we suggest to rely mostly on the flow run aggregation. For those use cases that need to access metrics off of tables other than the flow run table (e.g. time-based aggregation), we need to explore ways to specify this information as input (config, etc.)

I hope Li Lu is fine with this, because I remember him saying on YARN-3816 that he will be using it for offline aggregation in YARN-3817. I think rows from the application table are being used in the MR job there. Are you suggesting that for offline aggregation, based on config, we aggregate either all the application metrics (to flow or user) or nothing? Or that we list the set of metrics to aggregate in some config?
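The point about ints and shorts arriving from the REST layer can be illustrated with a small helper that widens any boxed integral value to a long before it is stored. This is a sketch under assumptions: `MetricValues.toLong` is a hypothetical helper, and rejecting non-integral values simply follows the "longs only" proposal in the thread.

```java
final class MetricValues {
  private MetricValues() {
  }

  // Widens Byte, Short, Integer, and Long to a primitive long; rejects everything else.
  static long toLong(Number value) {
    if (value instanceof Long || value instanceof Integer
        || value instanceof Short || value instanceof Byte) {
      return value.longValue();
    }
    throw new IllegalArgumentException("metric values must be integral, got " + value);
  }
}
```

For example, `MetricValues.toLong(Integer.valueOf(7))` yields `7L`, while a `Double` argument is rejected rather than silently truncated.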
[jira] [Commented] (YARN-4053) Change the way metric values are stored in HBase Storage
[ https://issues.apache.org/jira/browse/YARN-4053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14994603#comment-14994603 ]

Vrushali C commented on YARN-4053:

Thanks [~varun_saxena] for the patch and [~djp], [~gtCarrera], [~Naganarasimha], [~sjlee0] and [~jrottinghuis] for the discussion so far!

[~jrottinghuis], [~sjlee0] and I had an offline discussion on this yesterday. We discussed at length along the following vectors:
- metric datatype: long, double, either, or both?
- metric type storage and retrieval: single values vs. time series
- metrics in the context of aggregation: how to indicate whether to aggregate or not
- operations on metrics: sum vs. average, min/max

To summarize the discussion:
- Our proposal is to proceed with supporting only longs for now. We went over several situations involving how to store and query decimal numbers: as Doubles or as numerator/denominator, how to use filters while scanning for such stored values, how aggregation would treat them, etc. We thought about which metrics would be stored as Doubles and how the precision might affect aggregation. We finally concluded that we should start with storing longs only and make the code strictly accept longs (not even ints or shorts).
- For single value vs. time series, we suggest using a column prefix to distinguish them. For the read path, we can assume a metric is a single value unless the client specifically requests a time series (clients would need to ask for a time series explicitly).
- Regarding indicating whether to aggregate or not, we suggest relying mostly on the flow run aggregation. For those use cases that need to access metrics from tables other than the flow run table (e.g. time-based aggregation), we need to explore ways to specify this information as input (config, etc.).
- So, the current patch is along the lines of our proposal of using longs for metrics. But we are considering a different approach: creating a "converter" type and implementation. For non-metric columns, a "generic" converter that uses the GenericObjectMapper can be created and used implicitly. For the numeric (long) columns, a long converter would be used explicitly. We also need to revisit how it's done in FlowScanner (the current patch missed one of the places, for example). We need to get at the instances of ColumnPrefix, ColumnFamily, etc. and use them to get the converter in the flow scanner.

@Varun Would it be fine if I took over this jira to patch it with the above points?

thanks
Vrushali
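The read-path rule in the summary above (treat a metric as a single value unless the client explicitly asks for a time series) can be sketched as follows. `MetricReadSketch`, `ReadMode`, and the timestamp-to-value map are hypothetical stand-ins for whatever the reader actually uses; only the rule itself is taken from the comment.

```java
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;

final class MetricReadSketch {
  enum ReadMode { SINGLE_VALUE, TIME_SERIES }

  // stored maps timestamp -> value for one metric; the result is a fresh map.
  static NavigableMap<Long, Long> read(NavigableMap<Long, Long> stored, ReadMode mode) {
    NavigableMap<Long, Long> result = new TreeMap<>();
    if (stored.isEmpty()) {
      return result;
    }
    if (mode == ReadMode.TIME_SERIES) {
      // The client asked for the series explicitly: return every stored version.
      result.putAll(stored);
      return result;
    }
    // Default: only the value with the latest timestamp.
    Map.Entry<Long, Long> latest = stored.lastEntry();
    result.put(latest.getKey(), latest.getValue());
    return result;
  }
}
```

Note that both modes read the same stored data; the difference is only in how much of it is returned to the client.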
[jira] [Commented] (YARN-4053) Change the way metric values are stored in HBase Storage
[ https://issues.apache.org/jira/browse/YARN-4053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14974623#comment-14974623 ] Varun Saxena commented on YARN-4053: [~sjlee0], [~djp], [~vrushalic] and others, kindly review. In this patch I simply assume that we will handle all metrics as longs. When AggregationOperation is SUM(which is added as a cell tag/attribute), on the coprocessor side I assume this is for metrics because this is what it is meant for as of now. If we change this for something else, we can change the tag to indicate if this cell would contain a long too. Moreover, I have added a simple flag in ColumnHelper to indicate value has to be encoded and decoded as a long. Once we add support for other metrics, maybe TimelineMetric should indicate data type of metric but that is not being done for now so a flag(the approach adopted in patch) should do, as of now. Now there were a few other things which have to be handled as part of this patch but we have not yet reached consensus on or discussed them. # We need to decide how to indicate if a metric is to be aggregated or not. Currently this kept part of column qualifier in YARN-3816. We can continue with that I guess. As MR job run for offline aggregation would need this info as well. # Decide which metric is a TIME_SERIES and which one is SINGLE_VALUE(get only the latest value). Should we use tags for it and attach coprocessor with every table storing metrics ? Performance implications if that's done ? 
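To illustrate why a fixed long encoding helps the SUM case, here is a small sketch that adds up cell values stored as 8-byte big-endian longs, which is roughly the work a coprocessor would have to do for cells tagged with SUM. The class and method names are hypothetical, and plain byte arrays stand in for HBase cells.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper: sums metric cells that were written as 8-byte longs.
final class LongCellSum {
  private LongCellSum() {}

  // Encodes a long as 8 big-endian bytes (ByteBuffer's default byte order).
  static byte[] encode(long value) {
    return ByteBuffer.allocate(Long.BYTES).putLong(value).array();
  }

  // Decodes every cell as a long and adds them, failing loudly on overflow
  // or on a cell that is not exactly 8 bytes (for example a string-encoded value).
  static long sum(List<byte[]> cellValues) {
    long total = 0L;
    for (byte[] raw : cellValues) {
      if (raw.length != Long.BYTES) {
        throw new IllegalArgumentException("Not an 8-byte long: " + raw.length + " bytes");
      }
      total = Math.addExact(total, ByteBuffer.wrap(raw).getLong());
    }
    return total;
  }

  public static void main(String[] args) {
    List<byte[]> cells = Arrays.asList(encode(10L), encode(32L));
    System.out.println(sum(cells));
  }
}
```

The length check is the interesting part: if some writer still stored the value through a string encoding, the aggregation would fail visibly instead of summing garbage.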
[jira] [Commented] (YARN-4053) Change the way metric values are stored in HBase Storage
[ https://issues.apache.org/jira/browse/YARN-4053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14974100#comment-14974100 ]

Hadoop QA commented on YARN-4053:

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | pre-patch | 16m 45s | Findbugs (version ) appears to be broken on YARN-2928. |
| {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. |
| {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 2 new or modified test files. |
| {color:green}+1{color} | javac | 8m 10s | There were no new javac warning messages. |
| {color:green}+1{color} | javadoc | 10m 35s | There were no new javadoc warning messages. |
| {color:red}-1{color} | release audit | 0m 18s | The applied patch generated 1 release audit warnings. |
| {color:green}+1{color} | checkstyle | 0m 14s | There were no new checkstyle issues. |
| {color:green}+1{color} | whitespace | 0m 2s | The patch has no lines that end in whitespace. |
| {color:green}+1{color} | install | 1m 36s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse | 0m 40s | The patch built with eclipse:eclipse. |
| {color:green}+1{color} | findbugs | 0m 52s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | yarn tests | 2m 55s | Tests passed in hadoop-yarn-server-timelineservice. |
| | | 42m 13s | |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | http://issues.apache.org/jira/secure/attachment/12768689/YARN-4053-YARN-2928.02.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | YARN-2928 / 3c4e424 |
| Release Audit | https://builds.apache.org/job/PreCommit-YARN-Build/9575/artifact/patchprocess/patchReleaseAuditProblems.txt |
| hadoop-yarn-server-timelineservice test log | https://builds.apache.org/job/PreCommit-YARN-Build/9575/artifact/patchprocess/testrun_hadoop-yarn-server-timelineservice.txt |
| Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/9575/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf902.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | https://builds.apache.org/job/PreCommit-YARN-Build/9575/console |

This message was automatically generated.
[jira] [Commented] (YARN-4053) Change the way metric values are stored in HBase Storage
[ https://issues.apache.org/jira/browse/YARN-4053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14933492#comment-14933492 ]

Varun Saxena commented on YARN-4053:

I think we need to revive this JIRA and decide its scope. I will be converting everything to longs as of now, and throwing an exception if the value is a floating point or BigInteger value.

Other than that, we need to decide whether we need to differentiate between TIME_SERIES and SINGLE_VALUE, whether to indicate to the client that a particular metric is an aggregated metric, etc. The challenge, though, is that we cannot use cell tags for anything which needs to be sent back to the client, as a Get/Scan can't get hold of tags on the client side.
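The rule stated above (store integral values as longs, reject floating point and BigInteger values) could look roughly like the following. `MetricValues.toLongStrict` is a hypothetical helper written for this illustration; it is not the check used in the patch.

```java
import java.math.BigInteger;

// Hypothetical validation helper for incoming metric values.
final class MetricValues {
  private MetricValues() {}

  // Accepts only the boxed integral types that fit in a long without loss.
  // Everything else (Float, Double, BigInteger, BigDecimal, null, ...) is rejected.
  static long toLongStrict(Number value) {
    if (value instanceof Long || value instanceof Integer
        || value instanceof Short || value instanceof Byte) {
      return value.longValue();
    }
    throw new IllegalArgumentException("Unsupported metric value type: "
        + (value == null ? "null" : value.getClass().getName()));
  }

  public static void main(String[] args) {
    System.out.println(toLongStrict(7));
    try {
      toLongStrict(new BigInteger("123456789012345678901234567890"));
    } catch (IllegalArgumentException e) {
      System.out.println("rejected: " + e.getMessage());
    }
  }
}
```

Rejecting instead of silently truncating keeps the stored time series consistent, at the cost of pushing the conversion decision back to the client.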
[jira] [Commented] (YARN-4053) Change the way metric values are stored in HBase Storage
[ https://issues.apache.org/jira/browse/YARN-4053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14727816#comment-14727816 ]

Joep Rottinghuis commented on YARN-4053:

Part of the issue is that there are many moving components that all need to work together to handle longs and doubles properly: aggregation, store, read, column value queries, filters, etc. Tags have the challenges pointed out before, and in addition, tags are filtered out on read (you will not get all the tag values, unless you do voodoo in a coprocessor).

Since we're still figuring out how all these pieces work together in a consistent manner, our suggestion is to first make this all work together with longs only, and only after that figure out how to add support for doubles. Trying to make things work for both will bog us down right now and we'll likely not get things working consistently; we simply can't foresee all the consequences and impacts on aggregation, querying, etc.
[jira] [Commented] (YARN-4053) Change the way metric values are stored in HBase Storage
[ https://issues.apache.org/jira/browse/YARN-4053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14727641#comment-14727641 ]

Varun Saxena commented on YARN-4053:

bq. to use cell tag identify metrics value type

In YARN-3816, by metric value type I meant whether it's an aggregated metric or not. I guess that is what you were referring to. Cell tags can be useful that way. But identifying the metric's data type by cell tag won't work in our case because we have to apply metric filters.

IMHO, forcing the user to send consistent values is a fair enough expectation. If the user doesn't, they will get inconsistent results. If we go by this premise, I think we can go ahead with Longs and Doubles. Otherwise, we can treat all values as longs.

Regarding the point Vrushali raised above about floating point precision not being that important: as I pointed out in last Wednesday's meeting, and as the example [~djp] gives shows, floating points may not matter for very large values, but they would matter for metrics whose range is towards the lower end (say, 0 to 5). Now the real question is: do we expect such metrics?
[jira] [Commented] (YARN-4053) Change the way metric values are stored in HBase Storage
[ https://issues.apache.org/jira/browse/YARN-4053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14727620#comment-14727620 ]

Junping Du commented on YARN-4053:

Hi [~sjlee0] and all, sorry for coming late to this good discussion.

bq. we're of the opinion that we can start supporting only longs for now (i.e. no floating point types), while we can consider adding a floating point type (namely double) to the list of supported types.

If applications want to store percentage values, which are all between 0.0 and 1.0, what should they do? Cast the decimal numbers (0.49 and 0.50) directly to an integer (long), or multiply by 100 and divide by 100 later? I agree that supporting only Long makes things much easier. However, it seems it cannot satisfy some basic requirements and scenarios. Supporting Long and Double as the first step (instead of int, long, float and double) sounds like a reasonable compromise, though.

About solutions, I think [~varun_saxena] brought up an option in YARN-3816 to use a cell tag to identify the metric's value type, which sounds good to me. Do we have any concerns with this approach? I am quite interested in this as we may want to attach more meta info (raw or aggregated, etc.) to metrics in the future. Thoughts?
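The two options raised in the question above can be compared with a short sketch. It assumes a hypothetical client-side helper and a fixed scale factor of 100 (so the resolution is one percentage point); neither is part of the timeline service API.

```java
// Hypothetical client-side helper for storing fractions in a long-only store.
final class PercentMetric {
  // Assumption: two decimal digits of precision are enough for this metric.
  static final long SCALE = 100L;

  private PercentMetric() {}

  // Option 1: cast directly. Every fraction below 1.0 collapses to 0.
  static long castToLong(double fraction) {
    return (long) fraction;
  }

  // Option 2: scale before storing, so 0.49 is stored as 49.
  static long toStored(double fraction) {
    return Math.round(fraction * SCALE);
  }

  // Reverse of toStored, applied by the reader.
  static double fromStored(long stored) {
    return stored / (double) SCALE;
  }

  public static void main(String[] args) {
    System.out.println(castToLong(0.49) + " vs " + castToLong(0.50));
    System.out.println(toStored(0.49) + " vs " + toStored(0.50));
    System.out.println(fromStored(toStored(0.49)));
  }
}
```

Scaling keeps values distinguishable, but the writer and every reader must agree on the scale factor, which is exactly the kind of out-of-band contract this thread is trying to avoid or at least document.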
[jira] [Commented] (YARN-4053) Change the way metric values are stored in HBase Storage
[ https://issues.apache.org/jira/browse/YARN-4053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14717865#comment-14717865 ]

Sangjin Lee commented on YARN-4053:

[~vrushalic], [~jrottinghuis], and I discussed supported types a little more, and we're of the opinion that we can *start* supporting only longs for now (i.e. no floating point types), while we can consider adding a floating point type (namely double) to the list of supported types.

So for now, how about assuming (and enforcing) long as the type of the metric values, and pursuing how we can add double later if we need it? Thoughts?
[jira] [Commented] (YARN-4053) Change the way metric values are stored in HBase Storage
[ https://issues.apache.org/jira/browse/YARN-4053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14717863#comment-14717863 ]

Sangjin Lee commented on YARN-4053:

Thanks [~varun_saxena] for the discussion. As you said, one thing that really causes issues is when inconsistent values are used for the same metric. At a high level, I think we need to ask these questions:
- How important is it to support this scenario?
- If we don't really support this scenario, then what is the minimally acceptable behavior if that were to happen?

The gist of the problem is that one cannot really write/read consistent values without knowing the "right" type of the metric. The user will likely not know that on either the write or the read path. In the face of this, the main difference between approach #1 (encoding it into the value) and approach #2 (adding it to the column qualifier) is that approach #1 will mix different-type values into a single time series (column), and approach #2 will effectively create two separate time series (columns). The rest is the fallout.
[jira] [Commented] (YARN-4053) Change the way metric values are stored in HBase Storage
[ https://issues.apache.org/jira/browse/YARN-4053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14715192#comment-14715192 ]

Vrushali C commented on YARN-4053:

The way I see this, it comes down to a basic question of whether we really *need* floating point precision in metric values. For instance, cost is a metric which could have a decimal value upon calculation. But in my opinion, a cost of 5 dollars versus 5.347891 dollars versus 5.78913 dollars is not that different; a cost of 6.x dollars is different from 5.x. I believe that it does not matter THAT much whether the cost is 5.347891 or 5.79813.

These are hadoop applications; the time duration is rarely going to be exactly the same even for exactly the same code. So metrics will usually fluctuate slightly between different runs of the exact same job.

Storage and querying of Longs is straightforward and clean, with no ambiguity in serialization. Contrast that with storing various numerical data types in metrics:
- all the complexity of storing column prefixes that tell us which type is stored, so that serialization to/from hbase can be done correctly.
- filtering in hbase becomes so much more complicated with all these different datatypes.
[jira] [Commented] (YARN-4053) Change the way metric values are stored in HBase Storage
[ https://issues.apache.org/jira/browse/YARN-4053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14710651#comment-14710651 ]

Varun Saxena commented on YARN-4053:

Looking at the issues involved, IMO we should impose a restriction on the client so that it does not mix longs and doubles.
[jira] [Commented] (YARN-4053) Change the way metric values are stored in HBase Storage
[ https://issues.apache.org/jira/browse/YARN-4053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14710638#comment-14710638 ]

Varun Saxena commented on YARN-4053:

There was a suggestion that we can support only longs. Would supporting only longs not impact potential users of ATS? Longs, however, should cover most of the metrics (as of now I can’t think of any where decimals would be of great importance). If we do this, I think the TimelineMetric object should be changed to accept only java.lang.Long and not java.lang.Number… Looping in [~vinodkv] to get his opinion on this as well.

Although, is it unfair to ask the client to send values consistently? Can’t we document this and enforce this restriction? If the client does not comply, it cannot expect consistent results. This can be the contract between ATS and its clients. The major concern here, though, would be that it won’t be possible to enforce this restriction programmatically, neither on the client side nor on the server side.

*Possible Solution:*
There is one possible solution, though, if enforcing this restriction is not viable. The real problem in both the solutions would come in applying metric filters when data is inconsistent. For this, we can use approach 2 (include the type in the column qualifier) and then insert OR filters covering both column qualifiers for the same metric. I will elaborate with an example.

Let us say we have a metric called JOB_ELAPSED_TIME and the client can report both integral and floating point values for it. With approach 2, we will have 2 column qualifiers for this metric, i.e. “JOB_ELAPSED_TIME=L” (for longs) and “JOB_ELAPSED_TIME=D” (for doubles). Now, when a query comes with the metric filter value in integer format, something like JOB_ELAPSED_TIME > 40 can be transformed to the corresponding HBase filter of the form (“JOB_ELAPSED_TIME=L” > 40 OR “JOB_ELAPSED_TIME=D” > 40.0). That is, a filter list of the form (“m1” > 10 AND “m2” < 5 AND “m3” = 4) would be transformed to ((“m1=L” > 10 OR “m1=D” > 10.0) AND (“m2=L” < 5 OR “m2=D” < 5.0) AND (“m3=L” = 4 OR “m3=D” = 4.0)).

If the filter value is in decimal format then we will have to make additional changes. If the filter is something like JOB_ELAPSED_TIME > 40.75, it will have to be converted to (“JOB_ELAPSED_TIME=L” >= 41 OR “JOB_ELAPSED_TIME=D” > 40.75). As you can see here, while matching a double value against the column qualifier storing longs, I would need to round the value up to the next integer and change the filter to >=. Similar changes will be required for the less than (<) and equal to (=) comparisons as well.

However, I am not sure whether adding so many filters will cause a performance issue for HBase, because with this solution we will in essence be doubling the size of the metric filters.

One thing we need to note, though, is that if we do adopt approach 2 (including the type in the column qualifier), regex comparison might become an issue. Theoretically, regular expressions can become quite complex, so programmatically interpreting a regex and transforming it so that it matches both the long-related and the double-related column qualifier might induce bugs. Maybe we can just support wildcard match (\*) or just make do with prefix and substring filters. Thoughts?

However, we may want to match against only the latest version of the value for a metric. In that case, the solution suggested above won’t work.
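The filter rewrite described in the comment above can be sketched as follows. This only builds a textual form of the rewritten filter to show the rounding rules; it does not construct real HBase filter objects. The class, enum and method names are hypothetical, and the threshold is assumed to be finite and within the range of a long.

```java
// Hypothetical sketch of rewriting "metric OP threshold" for the =L and =D qualifiers.
final class MetricFilterRewrite {
  enum Op {
    GT(">"), LT("<"), EQ("=");

    final String symbol;

    Op(String symbol) {
      this.symbol = symbol;
    }
  }

  private MetricFilterRewrite() {}

  // Clause for the long-typed column, or null when no long value can match.
  static String longClause(String metric, Op op, double threshold) {
    String column = metric + "=L";
    long floor = (long) Math.floor(threshold);
    if (threshold == floor) {
      // Integral threshold: the comparison carries over unchanged.
      return column + " " + op.symbol + " " + floor;
    }
    switch (op) {
      case GT:
        // x > 40.75 holds for a long x exactly when x >= 41.
        return column + " >= " + (floor + 1);
      case LT:
        // x < 40.75 holds for a long x exactly when x <= 40.
        return column + " <= " + floor;
      default:
        // No long equals a fractional threshold.
        return null;
    }
  }

  // Full rewritten filter: OR of the long clause and the double clause.
  static String rewrite(String metric, Op op, double threshold) {
    String doubleClause = metric + "=D " + op.symbol + " " + threshold;
    String longClause = longClause(metric, op, threshold);
    return longClause == null ? doubleClause : "(" + longClause + " OR " + doubleClause + ")";
  }

  public static void main(String[] args) {
    System.out.println(rewrite("JOB_ELAPSED_TIME", Op.GT, 40));
    System.out.println(rewrite("JOB_ELAPSED_TIME", Op.GT, 40.75));
    System.out.println(rewrite("JOB_ELAPSED_TIME", Op.EQ, 40.75));
  }
}
```

Note that an equality filter with a fractional threshold drops the long clause entirely, which is one of the "similar changes" for = that the comment alludes to.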
[jira] [Commented] (YARN-4053) Change the way metric values are stored in HBase Storage
[ https://issues.apache.org/jira/browse/YARN-4053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14710633#comment-14710633 ]

Varun Saxena commented on YARN-4053:

Wanted to discuss so that we can reach a consensus on how to handle YARN-4053.

*Solution 1*: We can add a 1-byte flag as part of the metric value indicating whether we are storing an integral value (0) or a floating point value (1).
*Solution 2*: Another solution suggested is that the type can be part of the column qualifier, say something like metric=l, where "l" indicates long.

Another solution is to store everything as double. But would it be fair to impose this restriction on the client when it reads data from ATS? What if the client is expecting a long and is unable to handle a double?

The major issue surrounding the different approaches is what happens if the client does not report metric values consistently (same data type for the same metric). Now let us look at the scenarios where metric values come into the picture.

*1.* While writing an entity to HBase:
Here, we need to consider that for the same entity, a particular metric can be reported in multiple write calls. So it is possible that in one write all values for a particular metric are reported as longs and in another write all as floats. This can create inconsistency in both the solutions above (different flags and encodings for the same metric in Solution 1, and different column qualifiers for the same metric in Solution 2). We can add a valuetype field in TimelineMetric which indicates whether a set of values is long or float, and throw an exception in TimelineMetric at the time of adding a value if the types are not consistent. This will at least ensure the same data type within a particular write call. But even then the client has to make sure that data types are consistent across writes. I think getting a row to find out the column qualifier name or the flags attached to the values won't be a viable option. So some sort of restriction on the client (so that it sends consistent data types for the same metric) will have to be placed whether we adopt solution 1 or solution 2. Is there some HBase API I am not aware of?

*2.* While reading an entity from HBase in the absence of any HBase filter:
In this case there should be no issues in either solution 1 or solution 2, because we read everything as bytes from HBase. We can then do the appropriate conversion based on the flag or the column qualifier name.

*3.* While reading an entity from HBase in the presence of HBase filters:
We can have 2 kinds of HBase filters. One kind retrieves specific columns (to determine which metrics to return) and the other trims down the rows/entities to be returned based on metric value comparison. The first class of filters, which determine which columns to return, should work in both cases (Solution 1 and 2), even in solution 2, because we use prefix filters as of now. If we use regex matching, though, it might make things more complicated in the case of Solution 2. For the second set of filters, we would need to know the data type of the metric value in both the proposed solutions, because SingleColumnValueFilter requires the exact column qualifier name (for Solution 2). And for solution 1 we should also know the data type of the metric so that we can prefix the value to be compared against with the flag (so that BinaryComparator can be used). If we add filters to our data object model, we can probably include the data type in filters as well. But that again depends on the client, and whether it sends the correct data type or not.

As we saw in point 1, we need to impose a restriction on the client that it sends the same data type for every metric. Frankly, it should be easy for the client as well. If for a metric the client expects float values, it will most likely use Double or Float. Thoughts? Or are there other suggestions which can preclude the need for such a restriction?
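As an illustration of Solution 1, the sketch below writes a 1-byte type flag followed by 8 bytes of payload into a single buffer. The flag values 0 and 1 follow the comment above; the class name, the accepted `Number` subtypes and the use of `java.nio.ByteBuffer` are assumptions made for this example.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

// Hypothetical encoder for Solution 1: [flag byte][8-byte big-endian payload].
final class FlaggedMetricValue {
  static final byte INTEGRAL = 0;
  static final byte FLOATING = 1;

  private FlaggedMetricValue() {}

  static byte[] encode(Number value) {
    // One allocation holds both the flag and the payload.
    ByteBuffer buf = ByteBuffer.allocate(1 + Long.BYTES);
    if (value instanceof Long || value instanceof Integer) {
      buf.put(INTEGRAL).putLong(value.longValue());
    } else if (value instanceof Double || value instanceof Float) {
      buf.put(FLOATING).putDouble(value.doubleValue());
    } else {
      throw new IllegalArgumentException("Unsupported metric value: " + value);
    }
    return buf.array();
  }

  static Number decode(byte[] bytes) {
    if (bytes == null || bytes.length != 1 + Long.BYTES) {
      throw new IllegalArgumentException("Expected 9 bytes");
    }
    ByteBuffer buf = ByteBuffer.wrap(bytes);
    byte flag = buf.get();
    if (flag == INTEGRAL) {
      return Long.valueOf(buf.getLong());
    }
    if (flag == FLOATING) {
      return Double.valueOf(buf.getDouble());
    }
    throw new IllegalArgumentException("Unknown type flag: " + flag);
  }

  public static void main(String[] args) {
    // The same quantity written with different types yields different bytes,
    // so a byte-wise comparison only matches when the types agree.
    System.out.println(Arrays.equals(encode(5L), encode(5.0)));
    System.out.println(decode(encode(5L)) + " / " + decode(encode(5.0)));
  }
}
```

The example makes the filtering problem concrete: a filter value has to be encoded with the same flag as the stored value, so the reader must already know which type the writer used.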
[jira] [Commented] (YARN-4053) Change the way metric values are stored in HBase Storage
[ https://issues.apache.org/jira/browse/YARN-4053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14704368#comment-14704368 ]

Varun Saxena commented on YARN-4053:

bq. it might be good to restrict the numeric types the metric will support.
long and double sounds good to me. Can add verification as you said.

bq. HBase already provides a facility to encode and decode between numbers and bytes
Yes, I know. As I had to put one byte in front of the byte array, I moved the logic in Bytes.toBytes to a separate method. This was done to avoid creating 2 byte arrays (one inside Bytes.toBytes and one in ATS code) and then copying the result of Bytes.toBytes into the byte array created inside ATS code. Although this is just 8 bytes, so maybe we can do as you suggest.

bq. Also, instead of encoding the info whether this is an integral type vs. floating type into the value, it would be better to have this information in the column qualifier.
I see some issue in having this info in the column qualifier, because certain HBase filters like SingleColumnValueFilter require the exact column qualifier name. So we will again have to guess the type (similar to the current patch) when we use it.

Probably we can discuss this offline and conclude there. Will send a mail.
[jira] [Commented] (YARN-4053) Change the way metric values are stored in HBase Storage
[ https://issues.apache.org/jira/browse/YARN-4053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14703889#comment-14703889 ]

Sangjin Lee commented on YARN-4053:

I discussed this with [~jrottinghuis] a little more, and have some high level comments.

First off, HBase already provides a facility to encode and decode between numbers and bytes: {{Bytes.toBytes(long)}}, {{Bytes.toLong(byte[])}}, etc. We should use them instead of rolling our own.

Also, instead of encoding the info whether this is an integral type vs. floating type into the value, it would be better to have this information in the column qualifier. For example, if metric "foo" is of an integral type, then the column qualifier could contain "...=foo=l..." (where "l" denotes a long value). That way, we can easily read out the type of values and deserialize appropriately. It would also remove any uncertainty around having different types of data for the same metric. There may be other variations of the idea.

Finally, I think it might be good to restrict the numeric types the metric will support. My proposal would be to limit it to {{Long}} and {{Double}}. They should be able to encompass pretty much all types of metrics. This would simplify support quite a bit. We can have verification in {{TimelineMetric.addValue()}} and so on to check the incoming type.
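One way to see why a string representation does not serve metrics well is to compare how the two encodings order under an unsigned, byte-wise lexicographic comparison, which is the kind of comparison a binary comparator performs. The sketch below assumes the string form is the decimal digits of the value and uses `java.nio.ByteBuffer` to produce the 8-byte big-endian layout; the class and method names are hypothetical.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical comparison of string-encoded and binary-encoded longs.
final class EncodingOrder {
  private EncodingOrder() {}

  // Unsigned lexicographic comparison of two byte arrays.
  static int compareBytes(byte[] a, byte[] b) {
    int n = Math.min(a.length, b.length);
    for (int i = 0; i < n; i++) {
      int x = a[i] & 0xff;
      int y = b[i] & 0xff;
      if (x != y) {
        return x - y;
      }
    }
    return a.length - b.length;
  }

  // Assumption: the string representation is the decimal text of the value.
  static byte[] asString(long value) {
    return Long.toString(value).getBytes(StandardCharsets.UTF_8);
  }

  // Fixed-width 8-byte big-endian encoding.
  static byte[] asLong(long value) {
    return ByteBuffer.allocate(Long.BYTES).putLong(value).array();
  }

  public static void main(String[] args) {
    // As strings, "9" sorts after "10"; as fixed-width longs, 9 sorts before 10.
    System.out.println(compareBytes(asString(9L), asString(10L)) > 0);
    System.out.println(compareBytes(asLong(9L), asLong(10L)) < 0);
  }
}
```

Byte order matches numeric order only for non-negative longs here; a negative value has its high bit set and therefore sorts after every positive value in an unsigned comparison, so range filters on metrics that can go negative would need extra care.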
[jira] [Commented] (YARN-4053) Change the way metric values are stored in HBase Storage
[ https://issues.apache.org/jira/browse/YARN-4053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14700759#comment-14700759 ]
Naganarasimha G R commented on YARN-4053:
bq. place a restriction on client that it should send values in floating point format at all times if it wants to store some metric value as floating point. We can mention this in our documentation.
I think this approach is better, as we will be able to have filters based on values too, with lower processing costs.
[jira] [Commented] (YARN-4053) Change the way metric values are stored in HBase Storage
[ https://issues.apache.org/jira/browse/YARN-4053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14700757#comment-14700757 ]
Varun Saxena commented on YARN-4053:
[~sjlee0],
bq. I'm just curious (and perhaps this is a totally dumb question for a HBase newbie), is there a way to specify that the value type is a numeric type when we create the table or the column family? Does HBase itself support something like that?
AFAIK, no, there is no way to attach a type to a column qualifier or column family. HBase treats everything as just a sequence of bytes. It depends on the user how they encode or decode it.
bq. Another scenario to think about is what if users write metric values in an inconsistent manner. Suppose the user stored an integral value for a metric initially, but later attempted to store a floating value for the same metric. It sounds like it could be a silent failure? This should be a rare occurrence, but I think we need to give it some thought...
Yes, I did consider this scenario. That is why I said we can place a restriction on the client that it should send values in floating point format at all times if it wants to store some metric value as floating point. We can mention this in our documentation.
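The "silent failure" discussed above follows directly from the store being untyped: eight bytes written as a double can be read back as a long without any error. A minimal sketch, assuming plain 8-byte big-endian encodings (the class and method names are hypothetical, and `ByteBuffer` is used in place of any HBase API):

```java
import java.nio.ByteBuffer;

/** Hypothetical demo: the same 8 bytes decode "successfully" as either type. */
final class UntypedBytesDemo {
  /** Writes a double as 8 big-endian bytes. */
  static byte[] writeDouble(double v) {
    return ByteBuffer.allocate(Double.BYTES).putDouble(v).array();
  }

  /** Reads 8 bytes as a long, regardless of how they were written. */
  static long readAsLong(byte[] b) {
    return ByteBuffer.wrap(b).getLong();
  }

  /** Reads 8 bytes as a double, regardless of how they were written. */
  static double readAsDouble(byte[] b) {
    return ByteBuffer.wrap(b).getDouble();
  }
}
```

Reading `writeDouble(40.0)` back with `readAsLong` throws nothing; it simply returns the raw IEEE 754 bit pattern of 40.0 as an integer, which is not 40. This is why the thread looks for some place (documentation, qualifier, or flag) to record the intended type.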
[jira] [Commented] (YARN-4053) Change the way metric values are stored in HBase Storage
[ https://issues.apache.org/jira/browse/YARN-4053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14700381#comment-14700381 ]
Sangjin Lee commented on YARN-4053:
And I do think that we need to support floating type values. Another scenario to think about is what if users write metric values in an inconsistent manner. Suppose the user stored an integral value for a metric initially, but later attempted to store a floating value for the same metric. It sounds like it could be a silent failure? This should be a rare occurrence, but I think we need to give it some thought...
[jira] [Commented] (YARN-4053) Change the way metric values are stored in HBase Storage
[ https://issues.apache.org/jira/browse/YARN-4053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14700378#comment-14700378 ]
Sangjin Lee commented on YARN-4053:
Thanks [~varun_saxena] for pointing out an important issue. I would agree with [~gtCarrera9] that this is a bit lower in priority compared to YARN-3814, but it's an important issue nonetheless. I'm just curious (and perhaps this is a totally dumb question for an HBase newbie), is there a way to specify that the value type is a numeric type when we create the table or the column family? Does HBase itself support something like that?
[jira] [Commented] (YARN-4053) Change the way metric values are stored in HBase Storage
[ https://issues.apache.org/jira/browse/YARN-4053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14699971#comment-14699971 ]
Li Lu commented on YARN-4053:
bq. Oh sorry in my original comment, I meant that we can carry this logic over to other columns(not only metrics).
This looks good.
bq. I agree we can include type information in TimelineMetric object itself. That will be better.
Actually I believe we *are* carrying type information in TimelineMetrics, in the original (boxed) Java form. For now I think we're fine to move with float and long.
[jira] [Commented] (YARN-4053) Change the way metric values are stored in HBase Storage
[ https://issues.apache.org/jira/browse/YARN-4053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14699951#comment-14699951 ]
Varun Saxena commented on YARN-4053:
On second thoughts, all three types may make sense if we include filters as part of our object model and make the client create and send them. Let's discuss this on Wednesday in the weekly meeting.
[jira] [Commented] (YARN-4053) Change the way metric values are stored in HBase Storage
[ https://issues.apache.org/jira/browse/YARN-4053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14699907#comment-14699907 ]
Varun Saxena commented on YARN-4053:
Oh sorry, in my original comment I meant that we can carry this logic over to other columns (not only metrics). I agree we can include type information in the TimelineMetric object itself. That will be better. By the way, do you envisage metric values having anything other than float or long? I think TimelineData.FLOAT and TimelineData.LONG should be enough. Thoughts?
[jira] [Commented] (YARN-4053) Change the way metric values are stored in HBase Storage
[ https://issues.apache.org/jira/browse/YARN-4053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14699888#comment-14699888 ]
Li Lu commented on YARN-4053:
Thanks! Tomorrow LGTM.
[jira] [Commented] (YARN-4053) Change the way metric values are stored in HBase Storage
[ https://issues.apache.org/jira/browse/YARN-4053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14699885#comment-14699885 ]
Varun Saxena commented on YARN-4053:
[~gtCarrera9], yes, I will update a patch by tomorrow for YARN-3814 if no further comments come in. Do you want it today?
[jira] [Commented] (YARN-4053) Change the way metric values are stored in HBase Storage
[ https://issues.apache.org/jira/browse/YARN-4053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14699883#comment-14699883 ]
Varun Saxena commented on YARN-4053:
bq. We may have something like TimelineData.FLOAT, TimelineData.LONG, TimelineData.GENERIC_OBJECT, etc., so that we can easily transfer those messages
Yes this can be done. I meant the same when I mentioned we can extend the logic in the patch attached. Depends on what everyone agrees to.
[jira] [Commented] (YARN-4053) Change the way metric values are stored in HBase Storage
[ https://issues.apache.org/jira/browse/YARN-4053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14699872#comment-14699872 ]
Li Lu commented on YARN-4053:
Hi [~varun_saxena], thanks for the patch! With regard to the POC, I thought we agreed on the general plan of the POC on a web UI connected to the timeline reader during our weekly standup discussion? If this is the case, I would personally give the RESTful API patch slightly higher priority, since that is critical to the whole workflow of the reader/webUI interface.
About this patch, I totally agree that we should directly store the byte representation of the numbers instead of using the generic object mapper. Having looked at the patch, I have some general comments here. Maybe what we want here is a way to model the types of the timeline metrics, so that the type information can be carried over from TimelineMetric objects to the storage layer? We may have something like TimelineData.FLOAT, TimelineData.LONG, TimelineData.GENERIC_OBJECT, etc., so that we can easily transfer those messages? My main concern is on the ColumnPrefix descriptions, where we now use a boolean flag to indicate if the column is numeric or not. This will also help us to better organize the serialization and deserialization helper methods and all related tests. Let me know if this idea works here, thanks!
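One way to picture the type-model idea from the comment above is an enum that owns its own serialization, replacing a numeric/non-numeric boolean on the column prefix. The sketch below is an assumption-laden illustration: the enum name `TimelineDataType` is hypothetical (the thread only floats `TimelineData.FLOAT`, `TimelineData.LONG` and `TimelineData.GENERIC_OBJECT` as possible names), and the `GENERIC_OBJECT` branch uses a UTF-8 string as a stand-in for whatever GenericObjectMapper would produce.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

/** Hypothetical type model; each constant knows how to encode and decode itself. */
enum TimelineDataType {
  LONG {
    @Override
    byte[] encode(Object v) {
      return ByteBuffer.allocate(Long.BYTES).putLong(((Number) v).longValue()).array();
    }

    @Override
    Object decode(byte[] b) {
      return ByteBuffer.wrap(b).getLong();
    }
  },
  FLOAT {
    @Override
    byte[] encode(Object v) {
      return ByteBuffer.allocate(Double.BYTES).putDouble(((Number) v).doubleValue()).array();
    }

    @Override
    Object decode(byte[] b) {
      return ByteBuffer.wrap(b).getDouble();
    }
  },
  GENERIC_OBJECT {
    // Stand-in only: a real implementation would delegate to GenericObjectMapper.
    @Override
    byte[] encode(Object v) {
      return String.valueOf(v).getBytes(StandardCharsets.UTF_8);
    }

    @Override
    Object decode(byte[] b) {
      return new String(b, StandardCharsets.UTF_8);
    }
  };

  abstract byte[] encode(Object v);

  abstract Object decode(byte[] b);

  /** Picks a type from the boxed Java type of the value. */
  static TimelineDataType of(Object v) {
    if (v instanceof Long || v instanceof Integer) {
      return LONG;
    }
    if (v instanceof Double || v instanceof Float) {
      return FLOAT;
    }
    return GENERIC_OBJECT;
  }
}
```

Keeping the encode/decode pair next to each constant is one possible answer to the point about organizing the serialization helpers and their tests: adding a type means adding one constant rather than another boolean.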
[jira] [Commented] (YARN-4053) Change the way metric values are stored in HBase Storage
[ https://issues.apache.org/jira/browse/YARN-4053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14698608#comment-14698608 ]
Hadoop QA commented on YARN-4053:
\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | pre-patch | 16m 3s | Findbugs (version ) appears to be broken on YARN-2928. |
| {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. |
| {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 1 new or modified test files. |
| {color:green}+1{color} | javac | 7m 55s | There were no new javac warning messages. |
| {color:green}+1{color} | javadoc | 9m 54s | There were no new javadoc warning messages. |
| {color:green}+1{color} | release audit | 0m 23s | The applied patch does not increase the total number of release audit warnings. |
| {color:green}+1{color} | checkstyle | 0m 16s | There were no new checkstyle issues. |
| {color:green}+1{color} | whitespace | 0m 1s | The patch has no lines that end in whitespace. |
| {color:green}+1{color} | install | 1m 26s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse | 0m 40s | The patch built with eclipse:eclipse. |
| {color:green}+1{color} | findbugs | 0m 52s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | yarn tests | 1m 22s | Tests passed in hadoop-yarn-server-timelineservice. |
| | | 38m 58s | |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | http://issues.apache.org/jira/secure/attachment/12750693/YARN-4053-YARN-2928.01.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | YARN-2928 / f40c735 |
| hadoop-yarn-server-timelineservice test log | https://builds.apache.org/job/PreCommit-YARN-Build/8853/artifact/patchprocess/testrun_hadoop-yarn-server-timelineservice.txt |
| Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/8853/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf909.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | https://builds.apache.org/job/PreCommit-YARN-Build/8853/console |
This message was automatically generated.
[jira] [Commented] (YARN-4053) Change the way metric values are stored in HBase Storage
[ https://issues.apache.org/jira/browse/YARN-4053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14698602#comment-14698602 ]
Varun Saxena commented on YARN-4053:
This patch demonstrates the approach mentioned above and works for both integral and floating point values. But for floating point values, the restriction on the part of the client is that it should always send values in decimal format; otherwise, when I add metric filters, matching will fail. I guess it's a fair enough restriction to place.
In the patch, we can indicate that numerical values have to be stored per column/column prefix. We can however extend this logic for all values and indicate if values to be stored are ASCII encoded as well, so that different kinds of values can be stored differently in the same column. But there is no use case for this as of now, so I haven't done so.
I will remove the part about floating point numbers from the patch if we don't want it now.
[jira] [Commented] (YARN-4053) Change the way metric values are stored in HBase Storage
[ https://issues.apache.org/jira/browse/YARN-4053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14698209#comment-14698209 ]
Varun Saxena commented on YARN-4053:
Naga, yes, double can also be used. But the problem with storing double is that we will be sending back a value with decimals, and the client may not be able to interpret it properly if it's not expecting a decimal value (for integral values). IMO, we can't make assumptions about the client, because it may be assuming that a particular metric has been stored as an integral value in ATS. The same problem will exist with longs as well, but long is being considered because we are assuming we won't be supporting floating points as of now.
A more concrete solution would be as under:
Assume solution 2 which I proposed above. Here we can have a flag (1 byte; 1 bit should be enough, but for HBase storage 1 byte would be needed). This can indicate whether a value is to be interpreted as long or double. We can take care of encoding/decoding while reading/writing from HBase.
{noformat}
   1                   8
 ------------------------------------------
|     |                                    |
|  0  | (Integral value stored as long)    |
|     |                                    |
 ------------------------------------------
   1                   8
 ------------------------------------------
|     |                                    |
|  1  | (Floating value stored as double)  |
|     |                                    |
 ------------------------------------------
If flag is 0, it indicates integral value and if 1 it indicates floating point value.
{noformat}
And we can place a restriction on clients saying that if they want a metric to be of floating point format, in all interactions with ATS the value should be in decimals, i.e. 40 should be sent as 40.0, for instance. That will be a fair enough restriction to place.
However, it all depends on whether we want to support floating points at the moment.
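The flag-byte layout proposed in the comment above (one flag byte followed by an 8-byte value) could look roughly like the sketch below. It is an illustration only, not the attached patch: the class name and constants are hypothetical, `ByteBuffer` replaces any HBase utility, and treating every non-floating `Number` as a long is an assumption of this sketch.

```java
import java.nio.ByteBuffer;

/** Hypothetical codec for the "1 flag byte + 8 value bytes" layout. */
final class FlaggedMetricValue {
  static final byte INTEGRAL = 0; // value bytes hold a long
  static final byte FLOATING = 1; // value bytes hold a double
  static final int ENCODED_LENGTH = 1 + Long.BYTES;

  /** Encodes Double/Float as flag 1 + double, everything else as flag 0 + long. */
  static byte[] encode(Number value) {
    ByteBuffer buf = ByteBuffer.allocate(ENCODED_LENGTH);
    if (value instanceof Double || value instanceof Float) {
      buf.put(FLOATING).putDouble(value.doubleValue());
    } else {
      buf.put(INTEGRAL).putLong(value.longValue());
    }
    return buf.array();
  }

  /** Reads the flag byte first, then interprets the remaining 8 bytes accordingly. */
  static Number decode(byte[] bytes) {
    if (bytes.length != ENCODED_LENGTH) {
      throw new IllegalArgumentException("Expected " + ENCODED_LENGTH + " bytes, got " + bytes.length);
    }
    ByteBuffer buf = ByteBuffer.wrap(bytes);
    byte flag = buf.get();
    switch (flag) {
      case INTEGRAL:
        return buf.getLong();
      case FLOATING:
        return buf.getDouble();
      default:
        throw new IllegalArgumentException("Unknown type flag: " + flag);
    }
  }
}
```

Under this sketch a client that sends `40` gets a `Long` back and a client that sends `40.0` gets a `Double` back, which matches the restriction described in the comment. One trade-off worth noting is that the leading flag byte takes part in any byte-wise comparison of values, so filters would have to account for it.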
[jira] [Commented] (YARN-4053) Change the way metric values are stored in HBase Storage
[ https://issues.apache.org/jira/browse/YARN-4053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14698198#comment-14698198 ]
Varun Saxena commented on YARN-4053:
[~gtCarrera9], it is blocking metric filter implementation. We can discuss offline regarding what all you are looking at for POC of web UI. I can focus on that part first. I was under the assumption only YARN-3814 is required for Web UI POC as of now.
[jira] [Commented] (YARN-4053) Change the way metric values are stored in HBase Storage
[ https://issues.apache.org/jira/browse/YARN-4053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14697967#comment-14697967 ]
Naganarasimha G R commented on YARN-4053:
[~vrushalic], how about double? I feel it would be better, as it takes the same size as long (8 bytes) and supports decimals too.
[jira] [Commented] (YARN-4053) Change the way metric values are stored in HBase Storage
[ https://issues.apache.org/jira/browse/YARN-4053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=1469#comment-1469 ]
Li Lu commented on YARN-4053:
Hi [~varun_saxena], I agree this is a valid issue. Before we get deeply involved in this issue, I'm wondering if this is blocking any of our ongoing tasks to finish our planned POC of the reader and web UI?
[jira] [Commented] (YARN-4053) Change the way metric values are stored in HBase Storage
[ https://issues.apache.org/jira/browse/YARN-4053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14697583#comment-14697583 ]
Vrushali C commented on YARN-4053:
Hmm, good points. I think all metrics should be stored as the same type; otherwise we have to deal with knowing which metric is of which type, and would need to store metadata to know how to read it back.
Storing it as an ASCII value is not good; we need to be able to query for things like less than, greater than, etc.
My vote is for going with Longs for all metrics right now, unless there is a very strong use case where only decimals will do. We truncate (cast down) decimals to long if we receive any, so 99.9 means 99.
I realize this is restrictive, but my thinking is that instead of trying to do everything for this current ATS release, let's go with Longs and see if we really need decimal precision. If we do, we can revisit and modify to accept more data types.
cc [~jrottinghuis]
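Two points from the comment above can be made concrete: why ASCII-encoded numbers are awkward for less-than/greater-than queries, and what "cast down" does to a decimal. The sketch below is a hedged illustration with hypothetical names. It assumes that value comparisons are done byte by byte with unsigned bytes (the usual lexicographic byte ordering), and it only claims correct ordering for non-negative longs, since negative two's-complement values would sort after positive ones under that comparison.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

/** Hypothetical demo comparing ASCII and fixed-width binary encodings of metric values. */
final class MetricOrderingDemo {
  /** Encodes the decimal string form, e.g. 10 becomes the bytes of "10". */
  static byte[] ascii(long v) {
    return Long.toString(v).getBytes(StandardCharsets.US_ASCII);
  }

  /** Encodes as 8 big-endian bytes. */
  static byte[] binary(long v) {
    return ByteBuffer.allocate(Long.BYTES).putLong(v).array();
  }

  /** Lexicographic comparison treating each byte as unsigned. */
  static int compareBytes(byte[] a, byte[] b) {
    int n = Math.min(a.length, b.length);
    for (int i = 0; i < n; i++) {
      int x = a[i] & 0xff;
      int y = b[i] & 0xff;
      if (x != y) {
        return Integer.compare(x, y);
      }
    }
    return Integer.compare(a.length, b.length);
  }

  /** Truncation toward zero, as in "cast down decimals to long". */
  static long truncate(double v) {
    return (long) v;
  }
}
```

With these helpers, `compareBytes(ascii(9), ascii(10))` is positive because the byte for '9' is larger than the byte for '1', so 9 sorts after 10, while `compareBytes(binary(9), binary(10))` is negative as expected. `truncate(99.9)` returns 99, matching the example in the comment.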
[jira] [Commented] (YARN-4053) Change the way metric values are stored in HBase Storage
[ https://issues.apache.org/jira/browse/YARN-4053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14697571#comment-14697571 ]
Varun Saxena commented on YARN-4053:
Tez may not be publishing any floating point metric as of now. I am not too sure about what all they publish. So probably there is no use case as of now. But if we do not support floating point numbers, then we should clearly document that we will only support integral values, and do the conversion in the writer if any floating point value comes in.
[jira] [Commented] (YARN-4053) Change the way metric values are stored in HBase Storage
[ https://issues.apache.org/jira/browse/YARN-4053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14697556#comment-14697556 ]

Varun Saxena commented on YARN-4053:

bq. What kind of metrics do you have in mind that will have floating point numbers ?

There was some plan for reporting cluster-level metrics in the future too, and a few of them would be floating point as well. Refer to the JSON in YARN-3881.

Also, I remember some discussion during the aggregation design regarding storing averages. Are we planning to calculate them on the fly instead?

Moreover, TimelineMetric stores the metric value as a {{java.lang.Number}}. This means we are saying a metric can hold a floating point value as well. As we have no control over systems outside YARN (say Tez), if they use ATS and publish a metric of floating point type, I guess we should be able to handle it. Thoughts?

If it has been decided that metrics can only be integral values, then that's fine; we won't have to take care of it. Let me know.

Also, another key point we need to decide: do we only support values up to signed longs (8 bytes)?
[jira] [Commented] (YARN-4053) Change the way metric values are stored in HBase Storage
[ https://issues.apache.org/jira/browse/YARN-4053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14697500#comment-14697500 ]

Vrushali C commented on YARN-4053:

I think metric values should be stored (and read back) as Longs. What kind of metrics do you have in mind that will have floating point numbers? Any percentages that we want to store? I don't think we really need that level of precision.
[jira] [Commented] (YARN-4053) Change the way metric values are stored in HBase Storage
[ https://issues.apache.org/jira/browse/YARN-4053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14697454#comment-14697454 ]

Varun Saxena commented on YARN-4053:

cc [~sjlee0], [~djp], [~zjshen], [~vinodkv]. Thoughts? I will implement one of the options above, depending on the consensus.
[jira] [Commented] (YARN-4053) Change the way metric values are stored in HBase Storage
[ https://issues.apache.org/jira/browse/YARN-4053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14697449#comment-14697449 ]

Varun Saxena commented on YARN-4053:

Also, for floating point metrics, a query may arrive in integral form, which can create issues too. We should clearly document that queries for such metrics must also use the decimal representation. That is, to check a condition like m1 > 40, the client should send the filter as {{m1 > 40.0}} in the REST API, so that it is interpreted as a floating point number.
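To make the concern about integral query literals concrete, here is a minimal sketch. The parsing rule (a literal containing a decimal point is a double, anything else is a long) and all names are assumptions made for illustration; they are not the reader's actual filter parsing. The sketch shows that {{40}} and {{40.0}} encode to different 8-byte patterns, so comparing one against values stored in the other encoding would not be meaningful.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

/** Illustrative only: a hypothetical literal-parsing rule, not the ATS reader's logic. */
final class FilterLiteralSketch {

  /** Assumed rule: a literal with a decimal point is a Double, otherwise a Long. */
  static Number parseLiteral(String literal) {
    if (literal.contains(".")) {
      return Double.valueOf(literal);
    }
    return Long.valueOf(literal);
  }

  /** Encodes the value into 8 big-endian bytes, using the IEEE 754 bits for doubles. */
  static byte[] encode(Number value) {
    ByteBuffer buffer = ByteBuffer.allocate(Long.BYTES);
    if (value instanceof Double) {
      buffer.putDouble(value.doubleValue());
    } else {
      buffer.putLong(value.longValue());
    }
    return buffer.array();
  }

  public static void main(String[] args) {
    byte[] integral = encode(parseLiteral("40"));
    byte[] decimal = encode(parseLiteral("40.0"));
    System.out.println("40   -> " + Arrays.toString(integral));
    System.out.println("40.0 -> " + Arrays.toString(decimal));
    System.out.println("same bytes? " + Arrays.equals(integral, decimal));
  }
}
```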
[jira] [Commented] (YARN-4053) Change the way metric values are stored in HBase Storage
[ https://issues.apache.org/jira/browse/YARN-4053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14697445#comment-14697445 ]

Varun Saxena commented on YARN-4053:

So to resolve this we need some other way of storing metric values. The options are as follows:
# Keep the current way of storing metric values and write a custom filter to match the values. But this would need the new filter to be deployed on all region servers, so this solution may not be feasible. If we do not want to do this, then for lexicographic comparison to work, the sizes of the byte arrays being compared should be equal.
# Store values as primitive types, that is, a long as 8 bytes, an integer as 4 bytes, and so on. But this can create problems in lexicographical comparison too. Say metric m1 is stored as a long, but a query to the reader is of the form {{m1 > 4}}. As 4 will be interpreted as an Integer, we would try to compare 4 bytes against 8 bytes. The solution is to store every integral value as a long (8 bytes) and every floating point value as a double. The same approach can be used while matching on the reader side.
# The above solution may not work if we want to support BigInteger and BigDecimal values (i.e. numerical values > 8 bytes). 8 bytes should normally be enough, but aggregated values may exceed 8 bytes. In this case, we can decide up to how many bytes we need to support. 16 bytes, or for that matter even 12 bytes, should be more than enough for all realistic scenarios. While encoding, we can pad with zeroes in front if the number is less than 16 bytes.
# Another option is to continue supporting the string representation and restrict the maximum number of digits we support before and after the decimal point, say 30 digits before the decimal point and 3 after. We can pad the rest of the bytes with zeroes while storing so that comparison can be done.
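The width mismatch described in option 2 can be sketched as follows. This is a minimal illustration that assumes big-endian encoding and an unsigned, byte-by-byte lexicographic comparison; the class and method names are hypothetical and no HBase classes are used.

```java
import java.nio.ByteBuffer;

/** Illustrative only: fixed-width encoding versus mixed-width encoding. */
final class FixedWidthMetricEncoding {

  /** Encodes a long as 8 big-endian bytes. */
  static byte[] encodeLong(long value) {
    return ByteBuffer.allocate(Long.BYTES).putLong(value).array();
  }

  /** Encodes an int as 4 big-endian bytes. */
  static byte[] encodeInt(int value) {
    return ByteBuffer.allocate(Integer.BYTES).putInt(value).array();
  }

  /** Compares two byte arrays lexicographically, treating each byte as unsigned. */
  static int compareBytes(byte[] left, byte[] right) {
    int common = Math.min(left.length, right.length);
    for (int i = 0; i < common; i++) {
      int a = left[i] & 0xff;
      int b = right[i] & 0xff;
      if (a != b) {
        return a - b;
      }
    }
    // All shared bytes are equal, so the shorter array sorts first.
    return left.length - right.length;
  }

  public static void main(String[] args) {
    // Stored value 5 as a long, query literal 4 as an int: the widths differ.
    System.out.println("5L vs 4  (mixed widths): " + compareBytes(encodeLong(5L), encodeInt(4)));
    // Both sides widened to long: the comparison matches numeric order.
    System.out.println("5L vs 4L (same width):   " + compareBytes(encodeLong(5L), encodeLong(4L)));
  }
}
```

With mixed widths, the stored value 5 compares as smaller than the query literal 4, because the fourth byte of the long is 0 while the fourth byte of the int is 4. One caveat under these assumptions: negative longs have the sign bit set, so in an unsigned byte comparison they sort after positive values. Plain fixed-width encoding therefore only preserves numeric order for non-negative values.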
[jira] [Commented] (YARN-4053) Change the way metric values are stored in HBase Storage
[ https://issues.apache.org/jira/browse/YARN-4053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14697443#comment-14697443 ]

Varun Saxena commented on YARN-4053:

Storing metric values (which are numbers) as strings is fine if we only want to check them for equality. But we have to support all relational operations for metrics, and that is where the string representation doesn't work. This is because HBase filters currently use lexicographical comparison. This means that with the current mechanism for storing metric values, a value of 4000 will be judged as smaller than 60.
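The 4000 versus 60 case can be reproduced with a small sketch. It assumes the value is stored as its decimal digits in UTF-8, as the issue description states, and it uses a hand-written unsigned lexicographic comparison as a stand-in for the byte comparison done by filters; the names are hypothetical and no HBase classes are involved.

```java
import java.nio.charset.StandardCharsets;

/** Illustrative only: why string-encoded numbers misorder under byte comparison. */
final class StringMetricOrdering {

  /** Compares two byte arrays lexicographically, treating each byte as unsigned. */
  static int compareBytes(byte[] left, byte[] right) {
    int common = Math.min(left.length, right.length);
    for (int i = 0; i < common; i++) {
      int a = left[i] & 0xff;
      int b = right[i] & 0xff;
      if (a != b) {
        return a - b;
      }
    }
    // All shared bytes are equal, so the shorter array sorts first.
    return left.length - right.length;
  }

  /** Encodes a number as the UTF-8 bytes of its decimal digits. */
  static byte[] asStringBytes(long value) {
    return Long.toString(value).getBytes(StandardCharsets.UTF_8);
  }

  public static void main(String[] args) {
    int cmp = compareBytes(asStringBytes(4000L), asStringBytes(60L));
    // '4' (0x34) is less than '6' (0x36), so the comparison is decided at the first byte.
    System.out.println(cmp < 0 ? "4000 sorts before 60" : "60 sorts before 4000");
  }
}
```

Equality checks are unaffected, since identical numbers produce identical digit strings; only ordering comparisons go wrong.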