[jira] [Commented] (YARN-5109) timestamps are stored unencoded causing parse errors
[ https://issues.apache.org/jira/browse/YARN-5109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15369751#comment-15369751 ] Hudson commented on YARN-5109: -- SUCCESS: Integrated in Hadoop-trunk-Commit #10074 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/10074/]) YARN-5109. timestamps are stored unencoded causing parse errors (Varun (sjlee: rev 7b8cfa5c2ff62005c8b78867fedd64b48e50383d) * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice/src/main/java/org/apache/hadoop/yarn/server/timelineservice/storage/common/AppIdKeyConverter.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice/src/test/java/org/apache/hadoop/yarn/server/timelineservice/storage/common/TestKeyConverters.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice/src/main/java/org/apache/hadoop/yarn/server/timelineservice/storage/flow/FlowActivityRowKeyConverter.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice/src/main/java/org/apache/hadoop/yarn/server/timelineservice/storage/flow/FlowRunRowKeyConverter.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice-hbase-tests/src/test/java/org/apache/hadoop/yarn/server/timelineservice/storage/flow/TestHBaseStorageFlowActivity.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice/src/main/java/org/apache/hadoop/yarn/server/timelineservice/storage/reader/TimelineEntityReader.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice/src/main/java/org/apache/hadoop/yarn/server/timelineservice/storage/common/Separator.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice/src/main/java/org/apache/hadoop/yarn/server/timelineservice/storage/flow/FlowRunRowKey.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice/src/main/java/org/apache/hadoop/yarn/server/timelineservice/storage/apptoflow/AppToFlowRowKey.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice/src/main/java/org/apache/hadoop/yarn/server/timelineservice/storage/common/StringKeyConverter.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice/src/main/java/org/apache/hadoop/yarn/server/timelineservice/storage/flow/FlowActivityRowKey.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice/src/main/java/org/apache/hadoop/yarn/server/timelineservice/storage/flow/FlowRunColumnPrefix.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice/src/main/java/org/apache/hadoop/yarn/server/timelineservice/storage/reader/FlowActivityEntityReader.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice-hbase-tests/src/test/java/org/apache/hadoop/yarn/server/timelineservice/storage/TestHBaseTimelineStorage.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice/src/main/java/org/apache/hadoop/yarn/server/timelineservice/storage/common/KeyConverter.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice/src/main/java/org/apache/hadoop/yarn/server/timelineservice/storage/apptoflow/AppToFlowRowKeyConverter.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice/src/main/java/org/apache/hadoop/yarn/server/timelineservice/storage/application/ApplicationRowKey.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice/src/main/java/org/apache/hadoop/yarn/server/timelineservice/storage/application/ApplicationColumnPrefix.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice/src/main/java/org/apache/hadoop/yarn/server/timelineservice/storage/common/EventColumnName.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice/src/main/java/org/apache/hadoop/yarn/server/timelineservice/storage/entity/EntityRowKeyConverter.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice/src/test/java/org/apache/hadoop/yarn/server/timelineservice/storage/common/TestTimelineStorageUtils.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice/src/main/java/org/apache/hadoop/yarn/server/timelineservice/storage/entity/EntityColumnPrefix.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice/src/main/java/org/apache/hadoop/yarn/server/timelineservice/storage/common/TimelineStorageUtils.java *
[jira] [Commented] (YARN-5109) timestamps are stored unencoded causing parse errors
[ https://issues.apache.org/jira/browse/YARN-5109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15304659#comment-15304659 ] Varun Saxena commented on YARN-5109: Thanks [~sjlee0] and [~jrottinghuis] for the review and commit. And for suggesting the approach taken in the JIRA. > timestamps are stored unencoded causing parse errors > > > Key: YARN-5109 > URL: https://issues.apache.org/jira/browse/YARN-5109 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Affects Versions: YARN-2928 >Reporter: Sangjin Lee >Assignee: Varun Saxena >Priority: Blocker > Labels: yarn-2928-1st-milestone > Fix For: YARN-2928 > > Attachments: YARN-5109-YARN-2928.003.patch, > YARN-5109-YARN-2928.01.patch, YARN-5109-YARN-2928.02.patch, > YARN-5109-YARN-2928.03.patch, YARN-5109-YARN-2928.04.patch, > YARN-5109-YARN-2928.05.patch, YARN-5109-YARN-2928.06.patch, > YARN-5109-YARN-2928.07.patch, YARN-5109-YARN-2928.08.patch > > > When we store timestamps (for example as part of the row key or part of the > column name for an event), the bytes are used as is without any encoding. If > the byte value happens to contain a separator character we use (e.g. "!" or > "="), it causes a parse failure when we read it. > I came across this while looking into this error in the timeline reader: > {noformat} > 2016-05-17 21:28:38,643 WARN > org.apache.hadoop.yarn.server.timelineservice.storage.common.TimelineStorageUtils: > incorrectly formatted column name: it will be discarded > {noformat} > I traced the data that was causing this, and the column name (for the event) > was the following: > {noformat} > i:e!YARN_RM_CONTAINER_CREATED=\x7F\xFF\xFE\xABDY=\x99=YARN_CONTAINER_ALLOCATED_HOST > {noformat} > Note that the column name is supposed to be of the format (event > id)=(timestamp)=(event info key). However, observe the timestamp portion: > {noformat} > \x7F\xFF\xFE\xABDY=\x99 > {noformat} > The presence of the separator ("=") causes the parse error. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-5109) timestamps are stored unencoded causing parse errors
[ https://issues.apache.org/jira/browse/YARN-5109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15303236#comment-15303236 ] Sangjin Lee commented on YARN-5109: --- I am also +1 on the latest patch. I'll wait until the EOD for last minute comments before I commit it. > timestamps are stored unencoded causing parse errors > > > Key: YARN-5109 > URL: https://issues.apache.org/jira/browse/YARN-5109 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Affects Versions: YARN-2928 >Reporter: Sangjin Lee >Assignee: Varun Saxena >Priority: Blocker > Labels: yarn-2928-1st-milestone > Attachments: YARN-5109-YARN-2928.003.patch, > YARN-5109-YARN-2928.01.patch, YARN-5109-YARN-2928.02.patch, > YARN-5109-YARN-2928.03.patch, YARN-5109-YARN-2928.04.patch, > YARN-5109-YARN-2928.05.patch, YARN-5109-YARN-2928.06.patch, > YARN-5109-YARN-2928.07.patch, YARN-5109-YARN-2928.08.patch > > > When we store timestamps (for example as part of the row key or part of the > column name for an event), the bytes are used as is without any encoding. If > the byte value happens to contain a separator character we use (e.g. "!" or > "="), it causes a parse failure when we read it. > I came across this while looking into this error in the timeline reader: > {noformat} > 2016-05-17 21:28:38,643 WARN > org.apache.hadoop.yarn.server.timelineservice.storage.common.TimelineStorageUtils: > incorrectly formatted column name: it will be discarded > {noformat} > I traced the data that was causing this, and the column name (for the event) > was the following: > {noformat} > i:e!YARN_RM_CONTAINER_CREATED=\x7F\xFF\xFE\xABDY=\x99=YARN_CONTAINER_ALLOCATED_HOST > {noformat} > Note that the column name is supposed to be of the format (event > id)=(timestamp)=(event info key). However, observe the timestamp portion: > {noformat} > \x7F\xFF\xFE\xABDY=\x99 > {noformat} > The presence of the separator ("=") causes the parse error. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-5109) timestamps are stored unencoded causing parse errors
[ https://issues.apache.org/jira/browse/YARN-5109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15303184#comment-15303184 ] Joep Rottinghuis commented on YARN-5109: +1: 07 patch looks good to me. Thanks for all the patience in going back and forth on things [~varun_saxena], I think we ended up with a nice clean solution! > timestamps are stored unencoded causing parse errors > > > Key: YARN-5109 > URL: https://issues.apache.org/jira/browse/YARN-5109 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Affects Versions: YARN-2928 >Reporter: Sangjin Lee >Assignee: Varun Saxena >Priority: Blocker > Labels: yarn-2928-1st-milestone > Attachments: YARN-5109-YARN-2928.003.patch, > YARN-5109-YARN-2928.01.patch, YARN-5109-YARN-2928.02.patch, > YARN-5109-YARN-2928.03.patch, YARN-5109-YARN-2928.04.patch, > YARN-5109-YARN-2928.05.patch, YARN-5109-YARN-2928.06.patch, > YARN-5109-YARN-2928.07.patch, YARN-5109-YARN-2928.08.patch > > > When we store timestamps (for example as part of the row key or part of the > column name for an event), the bytes are used as is without any encoding. If > the byte value happens to contain a separator character we use (e.g. "!" or > "="), it causes a parse failure when we read it. > I came across this while looking into this error in the timeline reader: > {noformat} > 2016-05-17 21:28:38,643 WARN > org.apache.hadoop.yarn.server.timelineservice.storage.common.TimelineStorageUtils: > incorrectly formatted column name: it will be discarded > {noformat} > I traced the data that was causing this, and the column name (for the event) > was the following: > {noformat} > i:e!YARN_RM_CONTAINER_CREATED=\x7F\xFF\xFE\xABDY=\x99=YARN_CONTAINER_ALLOCATED_HOST > {noformat} > Note that the column name is supposed to be of the format (event > id)=(timestamp)=(event info key). However, observe the timestamp portion: > {noformat} > \x7F\xFF\xFE\xABDY=\x99 > {noformat} > The presence of the separator ("=") causes the parse error. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-5109) timestamps are stored unencoded causing parse errors
[ https://issues.apache.org/jira/browse/YARN-5109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15302969#comment-15302969 ] Hadoop QA commented on YARN-5109: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 12s {color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s {color} | {color:green} The patch appears to include 5 new or modified test files. {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 28s {color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 13s {color} | {color:green} YARN-2928 passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 23s {color} | {color:green} YARN-2928 passed with JDK v1.8.0_91 {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 31s {color} | {color:green} YARN-2928 passed with JDK v1.7.0_101 {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 27s {color} | {color:green} YARN-2928 passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 45s {color} | {color:green} YARN-2928 passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 32s {color} | {color:green} YARN-2928 passed {color} | | {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 0m 36s {color} | {color:red} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice-hbase-tests in YARN-2928 has 30 extant Findbugs warnings. {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 25s {color} | {color:green} YARN-2928 passed with JDK v1.8.0_91 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 30s {color} | {color:green} YARN-2928 passed with JDK v1.7.0_101 {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 11s {color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 37s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 17s {color} | {color:green} the patch passed with JDK v1.8.0_91 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 17s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 31s {color} | {color:green} the patch passed with JDK v1.7.0_101 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 31s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 25s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 42s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 28s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} Patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 33s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 22s {color} | {color:green} the patch passed with JDK v1.8.0_91 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 26s {color} | {color:green} the patch passed with JDK v1.7.0_101 {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 41s {color} | {color:green} hadoop-yarn-server-timelineservice in the patch passed with JDK v1.8.0_91. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 3m 25s {color} | {color:green} hadoop-yarn-server-timelineservice-hbase-tests in the patch passed with JDK v1.8.0_91. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 47s {color} | {color:green} hadoop-yarn-server-timelineservice in the patch passed with JDK v1.7.0_101. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 3m 40s {color} | {color:green} hadoop-yarn-server-timelineservice-hbase-tests in the patch passed with JDK v1.7.0_101. {color} | | {color:green}+1{color} | {color:green} asflicense {color} |
[jira] [Commented] (YARN-5109) timestamps are stored unencoded causing parse errors
[ https://issues.apache.org/jira/browse/YARN-5109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15302963#comment-15302963 ] Varun Saxena commented on YARN-5109: [~jrottinghuis], [~sjlee0], kindly review. The build is clean. Findbugs issues are in timelineservice-hbase-tests and will be handled by YARN-5142. The changes in latest patch over and above the 07 patch are related to adding TAB as a separator. And adding encoding/decoding for spaces and tabs in row keys. Also added code for encoding tabs in column names. > timestamps are stored unencoded causing parse errors > > > Key: YARN-5109 > URL: https://issues.apache.org/jira/browse/YARN-5109 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Affects Versions: YARN-2928 >Reporter: Sangjin Lee >Assignee: Varun Saxena >Priority: Blocker > Labels: yarn-2928-1st-milestone > Attachments: YARN-5109-YARN-2928.003.patch, > YARN-5109-YARN-2928.01.patch, YARN-5109-YARN-2928.02.patch, > YARN-5109-YARN-2928.03.patch, YARN-5109-YARN-2928.04.patch, > YARN-5109-YARN-2928.05.patch, YARN-5109-YARN-2928.06.patch, > YARN-5109-YARN-2928.07.patch, YARN-5109-YARN-2928.08.patch > > > When we store timestamps (for example as part of the row key or part of the > column name for an event), the bytes are used as is without any encoding. If > the byte value happens to contain a separator character we use (e.g. "!" or > "="), it causes a parse failure when we read it. > I came across this while looking into this error in the timeline reader: > {noformat} > 2016-05-17 21:28:38,643 WARN > org.apache.hadoop.yarn.server.timelineservice.storage.common.TimelineStorageUtils: > incorrectly formatted column name: it will be discarded > {noformat} > I traced the data that was causing this, and the column name (for the event) > was the following: > {noformat} > i:e!YARN_RM_CONTAINER_CREATED=\x7F\xFF\xFE\xABDY=\x99=YARN_CONTAINER_ALLOCATED_HOST > {noformat} > Note that the column name is supposed to be of the format (event > id)=(timestamp)=(event info key). However, observe the timestamp portion: > {noformat} > \x7F\xFF\xFE\xABDY=\x99 > {noformat} > The presence of the separator ("=") causes the parse error. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-5109) timestamps are stored unencoded causing parse errors
[ https://issues.apache.org/jira/browse/YARN-5109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15302958#comment-15302958 ] Hadoop QA commented on YARN-5109: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 16s {color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s {color} | {color:green} The patch appears to include 5 new or modified test files. {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 1m 23s {color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 7s {color} | {color:green} YARN-2928 passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 16s {color} | {color:green} YARN-2928 passed with JDK v1.8.0_91 {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 32s {color} | {color:green} YARN-2928 passed with JDK v1.7.0_101 {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 27s {color} | {color:green} YARN-2928 passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 48s {color} | {color:green} YARN-2928 passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 32s {color} | {color:green} YARN-2928 passed {color} | | {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 0m 36s {color} | {color:red} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice-hbase-tests in YARN-2928 has 30 extant Findbugs warnings. {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 25s {color} | {color:green} YARN-2928 passed with JDK v1.8.0_91 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 30s {color} | {color:green} YARN-2928 passed with JDK v1.7.0_101 {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 11s {color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 37s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 16s {color} | {color:green} the patch passed with JDK v1.8.0_91 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 16s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 30s {color} | {color:green} the patch passed with JDK v1.7.0_101 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 30s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 25s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 42s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 28s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} Patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 31s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 21s {color} | {color:green} the patch passed with JDK v1.8.0_91 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 27s {color} | {color:green} the patch passed with JDK v1.7.0_101 {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 40s {color} | {color:green} hadoop-yarn-server-timelineservice in the patch passed with JDK v1.8.0_91. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 3m 50s {color} | {color:green} hadoop-yarn-server-timelineservice-hbase-tests in the patch passed with JDK v1.8.0_91. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 47s {color} | {color:green} hadoop-yarn-server-timelineservice in the patch passed with JDK v1.7.0_101. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 4m 8s {color} | {color:green} hadoop-yarn-server-timelineservice-hbase-tests in the patch passed with JDK v1.7.0_101. {color} | | {color:green}+1{color} | {color:green} asflicense {color} |
[jira] [Commented] (YARN-5109) timestamps are stored unencoded causing parse errors
[ https://issues.apache.org/jira/browse/YARN-5109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15302764#comment-15302764 ] Varun Saxena commented on YARN-5109: [~jrottinghuis], I am currently writing code for encoding tabs and spaces in row keys and column names. Will update patch shortly. > timestamps are stored unencoded causing parse errors > > > Key: YARN-5109 > URL: https://issues.apache.org/jira/browse/YARN-5109 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Affects Versions: YARN-2928 >Reporter: Sangjin Lee >Assignee: Varun Saxena >Priority: Blocker > Labels: yarn-2928-1st-milestone > Attachments: YARN-5109-YARN-2928.003.patch, > YARN-5109-YARN-2928.01.patch, YARN-5109-YARN-2928.02.patch, > YARN-5109-YARN-2928.03.patch, YARN-5109-YARN-2928.04.patch, > YARN-5109-YARN-2928.05.patch, YARN-5109-YARN-2928.06.patch, > YARN-5109-YARN-2928.07.patch > > > When we store timestamps (for example as part of the row key or part of the > column name for an event), the bytes are used as is without any encoding. If > the byte value happens to contain a separator character we use (e.g. "!" or > "="), it causes a parse failure when we read it. > I came across this while looking into this error in the timeline reader: > {noformat} > 2016-05-17 21:28:38,643 WARN > org.apache.hadoop.yarn.server.timelineservice.storage.common.TimelineStorageUtils: > incorrectly formatted column name: it will be discarded > {noformat} > I traced the data that was causing this, and the column name (for the event) > was the following: > {noformat} > i:e!YARN_RM_CONTAINER_CREATED=\x7F\xFF\xFE\xABDY=\x99=YARN_CONTAINER_ALLOCATED_HOST > {noformat} > Note that the column name is supposed to be of the format (event > id)=(timestamp)=(event info key). However, observe the timestamp portion: > {noformat} > \x7F\xFF\xFE\xABDY=\x99 > {noformat} > The presence of the separator ("=") causes the parse error. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-5109) timestamps are stored unencoded causing parse errors
[ https://issues.apache.org/jira/browse/YARN-5109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15302736#comment-15302736 ] Joep Rottinghuis commented on YARN-5109: [~varun_saxena] let's leave that replace alone for right now. I'm about to file a separate jira with another (related issue) which would change that code anyway. Let's see if we can nail down this patch and get it in. So, String#replace and StringUtills#replace aside, is YARN-5109-YARN-2928.07.patch the patch to look at, or are there more open issues pending? > timestamps are stored unencoded causing parse errors > > > Key: YARN-5109 > URL: https://issues.apache.org/jira/browse/YARN-5109 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Affects Versions: YARN-2928 >Reporter: Sangjin Lee >Assignee: Varun Saxena >Priority: Blocker > Labels: yarn-2928-1st-milestone > Attachments: YARN-5109-YARN-2928.003.patch, > YARN-5109-YARN-2928.01.patch, YARN-5109-YARN-2928.02.patch, > YARN-5109-YARN-2928.03.patch, YARN-5109-YARN-2928.04.patch, > YARN-5109-YARN-2928.05.patch, YARN-5109-YARN-2928.06.patch, > YARN-5109-YARN-2928.07.patch > > > When we store timestamps (for example as part of the row key or part of the > column name for an event), the bytes are used as is without any encoding. If > the byte value happens to contain a separator character we use (e.g. "!" or > "="), it causes a parse failure when we read it. > I came across this while looking into this error in the timeline reader: > {noformat} > 2016-05-17 21:28:38,643 WARN > org.apache.hadoop.yarn.server.timelineservice.storage.common.TimelineStorageUtils: > incorrectly formatted column name: it will be discarded > {noformat} > I traced the data that was causing this, and the column name (for the event) > was the following: > {noformat} > i:e!YARN_RM_CONTAINER_CREATED=\x7F\xFF\xFE\xABDY=\x99=YARN_CONTAINER_ALLOCATED_HOST > {noformat} > Note that the column name is supposed to be of the format (event > id)=(timestamp)=(event info key). However, observe the timestamp portion: > {noformat} > \x7F\xFF\xFE\xABDY=\x99 > {noformat} > The presence of the separator ("=") causes the parse error. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-5109) timestamps are stored unencoded causing parse errors
[ https://issues.apache.org/jira/browse/YARN-5109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15302672#comment-15302672 ] Varun Saxena commented on YARN-5109: [~sjlee0], [~jrottinghuis] In Separator#encode, we are using String#replace which in turn uses Pattern. Why dont we use StringUtils#replace instead ? I think former would be slower. Thoughts ? > timestamps are stored unencoded causing parse errors > > > Key: YARN-5109 > URL: https://issues.apache.org/jira/browse/YARN-5109 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Affects Versions: YARN-2928 >Reporter: Sangjin Lee >Assignee: Varun Saxena >Priority: Blocker > Labels: yarn-2928-1st-milestone > Attachments: YARN-5109-YARN-2928.003.patch, > YARN-5109-YARN-2928.01.patch, YARN-5109-YARN-2928.02.patch, > YARN-5109-YARN-2928.03.patch, YARN-5109-YARN-2928.04.patch, > YARN-5109-YARN-2928.05.patch, YARN-5109-YARN-2928.06.patch, > YARN-5109-YARN-2928.07.patch > > > When we store timestamps (for example as part of the row key or part of the > column name for an event), the bytes are used as is without any encoding. If > the byte value happens to contain a separator character we use (e.g. "!" or > "="), it causes a parse failure when we read it. > I came across this while looking into this error in the timeline reader: > {noformat} > 2016-05-17 21:28:38,643 WARN > org.apache.hadoop.yarn.server.timelineservice.storage.common.TimelineStorageUtils: > incorrectly formatted column name: it will be discarded > {noformat} > I traced the data that was causing this, and the column name (for the event) > was the following: > {noformat} > i:e!YARN_RM_CONTAINER_CREATED=\x7F\xFF\xFE\xABDY=\x99=YARN_CONTAINER_ALLOCATED_HOST > {noformat} > Note that the column name is supposed to be of the format (event > id)=(timestamp)=(event info key). However, observe the timestamp portion: > {noformat} > \x7F\xFF\xFE\xABDY=\x99 > {noformat} > The presence of the separator ("=") causes the parse error. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-5109) timestamps are stored unencoded causing parse errors
[ https://issues.apache.org/jira/browse/YARN-5109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15302651#comment-15302651 ] Varun Saxena commented on YARN-5109: testWriteNullApplicationToHBase was failing due to the test case itself. It did not show up earlier because of the sequence of tests being run I guess. We were setting Scan#setStartRow in this test but setting a stop row which meant a row inserted in the new test I added was being picked up. Will fix the test case in this JIRA itself. > timestamps are stored unencoded causing parse errors > > > Key: YARN-5109 > URL: https://issues.apache.org/jira/browse/YARN-5109 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Affects Versions: YARN-2928 >Reporter: Sangjin Lee >Assignee: Varun Saxena >Priority: Blocker > Labels: yarn-2928-1st-milestone > Attachments: YARN-5109-YARN-2928.003.patch, > YARN-5109-YARN-2928.01.patch, YARN-5109-YARN-2928.02.patch, > YARN-5109-YARN-2928.03.patch, YARN-5109-YARN-2928.04.patch, > YARN-5109-YARN-2928.05.patch, YARN-5109-YARN-2928.06.patch, > YARN-5109-YARN-2928.07.patch > > > When we store timestamps (for example as part of the row key or part of the > column name for an event), the bytes are used as is without any encoding. If > the byte value happens to contain a separator character we use (e.g. "!" or > "="), it causes a parse failure when we read it. > I came across this while looking into this error in the timeline reader: > {noformat} > 2016-05-17 21:28:38,643 WARN > org.apache.hadoop.yarn.server.timelineservice.storage.common.TimelineStorageUtils: > incorrectly formatted column name: it will be discarded > {noformat} > I traced the data that was causing this, and the column name (for the event) > was the following: > {noformat} > i:e!YARN_RM_CONTAINER_CREATED=\x7F\xFF\xFE\xABDY=\x99=YARN_CONTAINER_ALLOCATED_HOST > {noformat} > Note that the column name is supposed to be of the format (event > id)=(timestamp)=(event info key). However, observe the timestamp portion: > {noformat} > \x7F\xFF\xFE\xABDY=\x99 > {noformat} > The presence of the separator ("=") causes the parse error. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-5109) timestamps are stored unencoded causing parse errors
[ https://issues.apache.org/jira/browse/YARN-5109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15302548#comment-15302548 ] Hadoop QA commented on YARN-5109: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 18s {color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s {color} | {color:green} The patch appears to include 5 new or modified test files. {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 55s {color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 8m 53s {color} | {color:green} YARN-2928 passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 49s {color} | {color:green} YARN-2928 passed with JDK v1.8.0_91 {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 53s {color} | {color:green} YARN-2928 passed with JDK v1.7.0_101 {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 29s {color} | {color:green} YARN-2928 passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 52s {color} | {color:green} YARN-2928 passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 35s {color} | {color:green} YARN-2928 passed {color} | | {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 0m 13s {color} | {color:red} branch/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice-hbase-tests no findbugs output file (hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice-hbase-tests/target/findbugsXml.xml) {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 33s {color} | {color:green} YARN-2928 passed with JDK v1.8.0_91 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 40s {color} | {color:green} YARN-2928 passed with JDK v1.7.0_101 {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 12s {color} | {color:blue} Maven dependency ordering for patch {color} | | {color:red}-1{color} | {color:red} mvninstall {color} | {color:red} 0m 14s {color} | {color:red} hadoop-yarn-server-timelineservice-hbase-tests in the patch failed. {color} | | {color:red}-1{color} | {color:red} compile {color} | {color:red} 1m 43s {color} | {color:red} hadoop-yarn-server in the patch failed with JDK v1.8.0_91. {color} | | {color:red}-1{color} | {color:red} javac {color} | {color:red} 1m 43s {color} | {color:red} hadoop-yarn-server in the patch failed with JDK v1.8.0_91. {color} | | {color:red}-1{color} | {color:red} compile {color} | {color:red} 1m 50s {color} | {color:red} hadoop-yarn-server in the patch failed with JDK v1.7.0_101. {color} | | {color:red}-1{color} | {color:red} javac {color} | {color:red} 1m 50s {color} | {color:red} hadoop-yarn-server in the patch failed with JDK v1.7.0_101. {color} | | {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 35s {color} | {color:red} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server: patch generated 10 new + 2 unchanged - 0 fixed = 12 total (was 2) {color} | | {color:red}-1{color} | {color:red} mvnsite {color} | {color:red} 0m 18s {color} | {color:red} hadoop-yarn-server-timelineservice-hbase-tests in the patch failed. {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 30s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} Patch has no whitespace issues. {color} | | {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 0m 14s {color} | {color:red} hadoop-yarn-server-timelineservice-hbase-tests in the patch failed. {color} | | {color:red}-1{color} | {color:red} javadoc {color} | {color:red} 1m 37s {color} | {color:red} hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-timelineservice-jdk1.8.0_91 with JDK v1.8.0_91 generated 2 new + 0 unchanged - 0 fixed = 2 total (was 0) {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 26s {color} | {color:green} the patch passed with JDK v1.8.0_91 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 30s {color} | {color:green} the patch passed with JDK v1.7.0_101 {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 57s {color} | {color:green}
[jira] [Commented] (YARN-5109) timestamps are stored unencoded causing parse errors
[ https://issues.apache.org/jira/browse/YARN-5109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15302534#comment-15302534 ] Varun Saxena commented on YARN-5109: Yeah, hadnt rebased the branch. Will check. > timestamps are stored unencoded causing parse errors > > > Key: YARN-5109 > URL: https://issues.apache.org/jira/browse/YARN-5109 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Affects Versions: YARN-2928 >Reporter: Sangjin Lee >Assignee: Varun Saxena >Priority: Blocker > Labels: yarn-2928-1st-milestone > Attachments: YARN-5109-YARN-2928.003.patch, > YARN-5109-YARN-2928.01.patch, YARN-5109-YARN-2928.02.patch, > YARN-5109-YARN-2928.03.patch, YARN-5109-YARN-2928.04.patch, > YARN-5109-YARN-2928.05.patch, YARN-5109-YARN-2928.06.patch, > YARN-5109-YARN-2928.07.patch > > > When we store timestamps (for example as part of the row key or part of the > column name for an event), the bytes are used as is without any encoding. If > the byte value happens to contain a separator character we use (e.g. "!" or > "="), it causes a parse failure when we read it. > I came across this while looking into this error in the timeline reader: > {noformat} > 2016-05-17 21:28:38,643 WARN > org.apache.hadoop.yarn.server.timelineservice.storage.common.TimelineStorageUtils: > incorrectly formatted column name: it will be discarded > {noformat} > I traced the data that was causing this, and the column name (for the event) > was the following: > {noformat} > i:e!YARN_RM_CONTAINER_CREATED=\x7F\xFF\xFE\xABDY=\x99=YARN_CONTAINER_ALLOCATED_HOST > {noformat} > Note that the column name is supposed to be of the format (event > id)=(timestamp)=(event info key). However, observe the timestamp portion: > {noformat} > \x7F\xFF\xFE\xABDY=\x99 > {noformat} > The presence of the separator ("=") causes the parse error. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-5109) timestamps are stored unencoded causing parse errors
[ https://issues.apache.org/jira/browse/YARN-5109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15302528#comment-15302528 ] Sangjin Lee commented on YARN-5109: --- Also, one of the unit tests is failing. Haven't checked again if the issue exists in the branch itself, but I have been running the tests regularly and haven't seen this, so it might be introduced by the patch. {noformat} testWriteNullApplicationToHBase(org.apache.hadoop.yarn.server.timelineservice.storage.TestHBaseTimelineStorage) Time elapsed: 0.09 sec <<< FAILURE! java.lang.AssertionError: expected:<0> but was:<1> at org.junit.Assert.fail(Assert.java:88) at org.junit.Assert.failNotEquals(Assert.java:743) at org.junit.Assert.assertEquals(Assert.java:118) at org.junit.Assert.assertEquals(Assert.java:555) at org.junit.Assert.assertEquals(Assert.java:542) at org.apache.hadoop.yarn.server.timelineservice.storage.TestHBaseTimelineStorage.testWriteNullApplicationToHBase(TestHBaseTimelineStorage.java:536) {noformat} > timestamps are stored unencoded causing parse errors > > > Key: YARN-5109 > URL: https://issues.apache.org/jira/browse/YARN-5109 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Affects Versions: YARN-2928 >Reporter: Sangjin Lee >Assignee: Varun Saxena >Priority: Blocker > Labels: yarn-2928-1st-milestone > Attachments: YARN-5109-YARN-2928.003.patch, > YARN-5109-YARN-2928.01.patch, YARN-5109-YARN-2928.02.patch, > YARN-5109-YARN-2928.03.patch, YARN-5109-YARN-2928.04.patch, > YARN-5109-YARN-2928.05.patch, YARN-5109-YARN-2928.06.patch, > YARN-5109-YARN-2928.07.patch > > > When we store timestamps (for example as part of the row key or part of the > column name for an event), the bytes are used as is without any encoding. If > the byte value happens to contain a separator character we use (e.g. "!" or > "="), it causes a parse failure when we read it. > I came across this while looking into this error in the timeline reader: > {noformat} > 2016-05-17 21:28:38,643 WARN > org.apache.hadoop.yarn.server.timelineservice.storage.common.TimelineStorageUtils: > incorrectly formatted column name: it will be discarded > {noformat} > I traced the data that was causing this, and the column name (for the event) > was the following: > {noformat} > i:e!YARN_RM_CONTAINER_CREATED=\x7F\xFF\xFE\xABDY=\x99=YARN_CONTAINER_ALLOCATED_HOST > {noformat} > Note that the column name is supposed to be of the format (event > id)=(timestamp)=(event info key). However, observe the timestamp portion: > {noformat} > \x7F\xFF\xFE\xABDY=\x99 > {noformat} > The presence of the separator ("=") causes the parse error. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-5109) timestamps are stored unencoded causing parse errors
[ https://issues.apache.org/jira/browse/YARN-5109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15302520#comment-15302520 ] Varun Saxena commented on YARN-5109: Ohh...I hadn't updated my branch. Let me fix this and update. > timestamps are stored unencoded causing parse errors > > > Key: YARN-5109 > URL: https://issues.apache.org/jira/browse/YARN-5109 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Affects Versions: YARN-2928 >Reporter: Sangjin Lee >Assignee: Varun Saxena >Priority: Blocker > Labels: yarn-2928-1st-milestone > Attachments: YARN-5109-YARN-2928.003.patch, > YARN-5109-YARN-2928.01.patch, YARN-5109-YARN-2928.02.patch, > YARN-5109-YARN-2928.03.patch, YARN-5109-YARN-2928.04.patch, > YARN-5109-YARN-2928.05.patch, YARN-5109-YARN-2928.06.patch, > YARN-5109-YARN-2928.07.patch > > > When we store timestamps (for example as part of the row key or part of the > column name for an event), the bytes are used as is without any encoding. If > the byte value happens to contain a separator character we use (e.g. "!" or > "="), it causes a parse failure when we read it. > I came across this while looking into this error in the timeline reader: > {noformat} > 2016-05-17 21:28:38,643 WARN > org.apache.hadoop.yarn.server.timelineservice.storage.common.TimelineStorageUtils: > incorrectly formatted column name: it will be discarded > {noformat} > I traced the data that was causing this, and the column name (for the event) > was the following: > {noformat} > i:e!YARN_RM_CONTAINER_CREATED=\x7F\xFF\xFE\xABDY=\x99=YARN_CONTAINER_ALLOCATED_HOST > {noformat} > Note that the column name is supposed to be of the format (event > id)=(timestamp)=(event info key). However, observe the timestamp portion: > {noformat} > \x7F\xFF\xFE\xABDY=\x99 > {noformat} > The presence of the separator ("=") causes the parse error. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-5109) timestamps are stored unencoded causing parse errors
[ https://issues.apache.org/jira/browse/YARN-5109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15302510#comment-15302510 ] Sangjin Lee commented on YARN-5109: --- Thanks for updating the patch [~varun_saxena]! I'll go over it one more time. FYI, it appears it doesn't compile cleanly? {noformat} [ERROR] Failed to execute goal org.apache.maven.plugins:maven-compiler-plugin:3.1:testCompile (default-testCompile) on project hadoop-yarn-server-timelineservice-hbase-tests: Compilation failure [ERROR] /Users/sjlee/git/hadoop-ats/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice-hbase-tests/src/test/java/org/apache/hadoop/yarn/server/timelineservice/storage/TestHBaseTimelineStorage.java:[521,24] cannot find symbol [ERROR] symbol: variable Bytes [ERROR] location: class org.apache.hadoop.yarn.server.timelineservice.storage.TestHBaseTimelineStorage {noformat} Missing import? > timestamps are stored unencoded causing parse errors > > > Key: YARN-5109 > URL: https://issues.apache.org/jira/browse/YARN-5109 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Affects Versions: YARN-2928 >Reporter: Sangjin Lee >Assignee: Varun Saxena >Priority: Blocker > Labels: yarn-2928-1st-milestone > Attachments: YARN-5109-YARN-2928.003.patch, > YARN-5109-YARN-2928.01.patch, YARN-5109-YARN-2928.02.patch, > YARN-5109-YARN-2928.03.patch, YARN-5109-YARN-2928.04.patch, > YARN-5109-YARN-2928.05.patch, YARN-5109-YARN-2928.06.patch, > YARN-5109-YARN-2928.07.patch > > > When we store timestamps (for example as part of the row key or part of the > column name for an event), the bytes are used as is without any encoding. If > the byte value happens to contain a separator character we use (e.g. "!" or > "="), it causes a parse failure when we read it. > I came across this while looking into this error in the timeline reader: > {noformat} > 2016-05-17 21:28:38,643 WARN > org.apache.hadoop.yarn.server.timelineservice.storage.common.TimelineStorageUtils: > incorrectly formatted column name: it will be discarded > {noformat} > I traced the data that was causing this, and the column name (for the event) > was the following: > {noformat} > i:e!YARN_RM_CONTAINER_CREATED=\x7F\xFF\xFE\xABDY=\x99=YARN_CONTAINER_ALLOCATED_HOST > {noformat} > Note that the column name is supposed to be of the format (event > id)=(timestamp)=(event info key). However, observe the timestamp portion: > {noformat} > \x7F\xFF\xFE\xABDY=\x99 > {noformat} > The presence of the separator ("=") causes the parse error. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-5109) timestamps are stored unencoded causing parse errors
[ https://issues.apache.org/jira/browse/YARN-5109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15302345#comment-15302345 ] Joep Rottinghuis commented on YARN-5109: Indeed it is easy to do now with the way KeyConverter and Separator are written, and yeah I was ambiguous about whether we should encode. After thinking about it a bit more I do think we should encode tabs as well. If we encode both we should ensure that we encode and decode in the same order. Probably as a general rule we should encode/decode things that are specified by a user, especially those things that we can expect to see spaces (or tabs) in, but probably as a good practice any values that comes from a user that goes into a column qualifier. > timestamps are stored unencoded causing parse errors > > > Key: YARN-5109 > URL: https://issues.apache.org/jira/browse/YARN-5109 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Affects Versions: YARN-2928 >Reporter: Sangjin Lee >Assignee: Varun Saxena >Priority: Blocker > Labels: yarn-2928-1st-milestone > Attachments: YARN-5109-YARN-2928.003.patch, > YARN-5109-YARN-2928.01.patch, YARN-5109-YARN-2928.02.patch, > YARN-5109-YARN-2928.03.patch, YARN-5109-YARN-2928.04.patch, > YARN-5109-YARN-2928.05.patch, YARN-5109-YARN-2928.06.patch > > > When we store timestamps (for example as part of the row key or part of the > column name for an event), the bytes are used as is without any encoding. If > the byte value happens to contain a separator character we use (e.g. "!" or > "="), it causes a parse failure when we read it. > I came across this while looking into this error in the timeline reader: > {noformat} > 2016-05-17 21:28:38,643 WARN > org.apache.hadoop.yarn.server.timelineservice.storage.common.TimelineStorageUtils: > incorrectly formatted column name: it will be discarded > {noformat} > I traced the data that was causing this, and the column name (for the event) > was the following: > {noformat} > i:e!YARN_RM_CONTAINER_CREATED=\x7F\xFF\xFE\xABDY=\x99=YARN_CONTAINER_ALLOCATED_HOST > {noformat} > Note that the column name is supposed to be of the format (event > id)=(timestamp)=(event info key). However, observe the timestamp portion: > {noformat} > \x7F\xFF\xFE\xABDY=\x99 > {noformat} > The presence of the separator ("=") causes the parse error. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-5109) timestamps are stored unencoded causing parse errors
[ https://issues.apache.org/jira/browse/YARN-5109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15301889#comment-15301889 ] Varun Saxena commented on YARN-5109: bq. It would be relatively easy to encode (and decode) tabs in strings, which should just happen in one or two methods now right? Not sure if I understood your comment. You mean support for tab encoding should be easy now with KeyConverter interface ? Or you want me to encode tabs too in column qualifiers. Because from the explanation above, that should be a problem too. Right ? Also, I think we should also encode event id and event info key for spaces in EventColumnNameConverter. > timestamps are stored unencoded causing parse errors > > > Key: YARN-5109 > URL: https://issues.apache.org/jira/browse/YARN-5109 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Affects Versions: YARN-2928 >Reporter: Sangjin Lee >Assignee: Varun Saxena >Priority: Blocker > Labels: yarn-2928-1st-milestone > Attachments: YARN-5109-YARN-2928.003.patch, > YARN-5109-YARN-2928.01.patch, YARN-5109-YARN-2928.02.patch, > YARN-5109-YARN-2928.03.patch, YARN-5109-YARN-2928.04.patch, > YARN-5109-YARN-2928.05.patch, YARN-5109-YARN-2928.06.patch > > > When we store timestamps (for example as part of the row key or part of the > column name for an event), the bytes are used as is without any encoding. If > the byte value happens to contain a separator character we use (e.g. "!" or > "="), it causes a parse failure when we read it. > I came across this while looking into this error in the timeline reader: > {noformat} > 2016-05-17 21:28:38,643 WARN > org.apache.hadoop.yarn.server.timelineservice.storage.common.TimelineStorageUtils: > incorrectly formatted column name: it will be discarded > {noformat} > I traced the data that was causing this, and the column name (for the event) > was the following: > {noformat} > i:e!YARN_RM_CONTAINER_CREATED=\x7F\xFF\xFE\xABDY=\x99=YARN_CONTAINER_ALLOCATED_HOST > {noformat} > Note that the column name is supposed to be of the format (event > id)=(timestamp)=(event info key). However, observe the timestamp portion: > {noformat} > \x7F\xFF\xFE\xABDY=\x99 > {noformat} > The presence of the separator ("=") causes the parse error. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-5109) timestamps are stored unencoded causing parse errors
[ https://issues.apache.org/jira/browse/YARN-5109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15301051#comment-15301051 ] Joep Rottinghuis commented on YARN-5109: [~varun_saxena] please go ahead with the patch. I was thinking about a couple of additional items, but that goes beyond this patch, so I'll file a separate jira and patch for those, so please go ahead with what you have and what we discussed. Wrt. encoding a long, you're right, we should probably either have a LongKeyConverter and use that to cleanly go back and forth. Note that we do already have a LongConverter implementing a slightly different interface (NumericValueConverter). We could either add an additional interface to this, or create a new class and have the implementation delegated to the existing class. bq. "By the way, we are encoding spaces in column qualifiers. Any reason why we would not want spaces in column qualifiers ? We are not using space as a separator." Yeah, while in HBase you can technically use non-printable characters or pretty much any series of bytes as column qualifiers, when working with the data, and especially administering any of this through the HBase shell, using spaces (or even tabs) make our life really difficult. We don't really expect tabs in names that are sent, but spaces are common in application names. Spaces are similarly inconvenient to deal with in rest style calls as well, but that is a slightly different matter. Tabs are further often used in mapreduce to separate keys and values, so that adds further headache. It would be relatively easy to encode (and decode) tabs in strings, which should just happen in one or two methods now right? > timestamps are stored unencoded causing parse errors > > > Key: YARN-5109 > URL: https://issues.apache.org/jira/browse/YARN-5109 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Affects Versions: YARN-2928 >Reporter: Sangjin Lee >Assignee: Varun Saxena >Priority: Blocker > Labels: yarn-2928-1st-milestone > Attachments: YARN-5109-YARN-2928.003.patch, > YARN-5109-YARN-2928.01.patch, YARN-5109-YARN-2928.02.patch, > YARN-5109-YARN-2928.03.patch, YARN-5109-YARN-2928.04.patch, > YARN-5109-YARN-2928.05.patch, YARN-5109-YARN-2928.06.patch > > > When we store timestamps (for example as part of the row key or part of the > column name for an event), the bytes are used as is without any encoding. If > the byte value happens to contain a separator character we use (e.g. "!" or > "="), it causes a parse failure when we read it. > I came across this while looking into this error in the timeline reader: > {noformat} > 2016-05-17 21:28:38,643 WARN > org.apache.hadoop.yarn.server.timelineservice.storage.common.TimelineStorageUtils: > incorrectly formatted column name: it will be discarded > {noformat} > I traced the data that was causing this, and the column name (for the event) > was the following: > {noformat} > i:e!YARN_RM_CONTAINER_CREATED=\x7F\xFF\xFE\xABDY=\x99=YARN_CONTAINER_ALLOCATED_HOST > {noformat} > Note that the column name is supposed to be of the format (event > id)=(timestamp)=(event info key). However, observe the timestamp portion: > {noformat} > \x7F\xFF\xFE\xABDY=\x99 > {noformat} > The presence of the separator ("=") causes the parse error. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-5109) timestamps are stored unencoded causing parse errors
[ https://issues.apache.org/jira/browse/YARN-5109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15300646#comment-15300646 ] Varun Saxena commented on YARN-5109: By the way, we are encoding spaces in column qualifiers. Any reason why we would not want spaces in column qualifiers ? We are not using space as a separator. Moreover, in event column name we are not encoding spaces for event id and event info components. Is it not required ? > timestamps are stored unencoded causing parse errors > > > Key: YARN-5109 > URL: https://issues.apache.org/jira/browse/YARN-5109 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Affects Versions: YARN-2928 >Reporter: Sangjin Lee >Assignee: Varun Saxena >Priority: Blocker > Labels: yarn-2928-1st-milestone > Attachments: YARN-5109-YARN-2928.003.patch, > YARN-5109-YARN-2928.01.patch, YARN-5109-YARN-2928.02.patch, > YARN-5109-YARN-2928.03.patch, YARN-5109-YARN-2928.04.patch, > YARN-5109-YARN-2928.05.patch, YARN-5109-YARN-2928.06.patch > > > When we store timestamps (for example as part of the row key or part of the > column name for an event), the bytes are used as is without any encoding. If > the byte value happens to contain a separator character we use (e.g. "!" or > "="), it causes a parse failure when we read it. > I came across this while looking into this error in the timeline reader: > {noformat} > 2016-05-17 21:28:38,643 WARN > org.apache.hadoop.yarn.server.timelineservice.storage.common.TimelineStorageUtils: > incorrectly formatted column name: it will be discarded > {noformat} > I traced the data that was causing this, and the column name (for the event) > was the following: > {noformat} > i:e!YARN_RM_CONTAINER_CREATED=\x7F\xFF\xFE\xABDY=\x99=YARN_CONTAINER_ALLOCATED_HOST > {noformat} > Note that the column name is supposed to be of the format (event > id)=(timestamp)=(event info key). However, observe the timestamp portion: > {noformat} > \x7F\xFF\xFE\xABDY=\x99 > {noformat} > The presence of the separator ("=") causes the parse error. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-5109) timestamps are stored unencoded causing parse errors
[ https://issues.apache.org/jira/browse/YARN-5109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15300570#comment-15300570 ] Varun Saxena commented on YARN-5109: Regarding point 1, I had seen this while coding. GenericObjectMapper will basically generate a String representation of long. And for longs, space encoding wont be required. But to make it consistent, I can probably convert long to String instead of calling GenericObjectMapper and call StringKeyConverter. For point 2, I think we can add a javadoc. Was thinking of it. But later slipped out of my mind. Agree with point 3, can change this method as well. If you are reviewing, I can probably wait for more comments and then update the patch. > timestamps are stored unencoded causing parse errors > > > Key: YARN-5109 > URL: https://issues.apache.org/jira/browse/YARN-5109 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Affects Versions: YARN-2928 >Reporter: Sangjin Lee >Assignee: Varun Saxena >Priority: Blocker > Labels: yarn-2928-1st-milestone > Attachments: YARN-5109-YARN-2928.003.patch, > YARN-5109-YARN-2928.01.patch, YARN-5109-YARN-2928.02.patch, > YARN-5109-YARN-2928.03.patch, YARN-5109-YARN-2928.04.patch, > YARN-5109-YARN-2928.05.patch, YARN-5109-YARN-2928.06.patch > > > When we store timestamps (for example as part of the row key or part of the > column name for an event), the bytes are used as is without any encoding. If > the byte value happens to contain a separator character we use (e.g. "!" or > "="), it causes a parse failure when we read it. > I came across this while looking into this error in the timeline reader: > {noformat} > 2016-05-17 21:28:38,643 WARN > org.apache.hadoop.yarn.server.timelineservice.storage.common.TimelineStorageUtils: > incorrectly formatted column name: it will be discarded > {noformat} > I traced the data that was causing this, and the column name (for the event) > was the following: > {noformat} > i:e!YARN_RM_CONTAINER_CREATED=\x7F\xFF\xFE\xABDY=\x99=YARN_CONTAINER_ALLOCATED_HOST > {noformat} > Note that the column name is supposed to be of the format (event > id)=(timestamp)=(event info key). However, observe the timestamp portion: > {noformat} > \x7F\xFF\xFE\xABDY=\x99 > {noformat} > The presence of the separator ("=") causes the parse error. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-5109) timestamps are stored unencoded causing parse errors
[ https://issues.apache.org/jira/browse/YARN-5109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15299567#comment-15299567 ] Joep Rottinghuis commented on YARN-5109: Overall, the approach and implementation looks really clean and solid. I'll have to look at little further tomorrow, but I have three comments: 1) While reviewing patch for all places where we're converting strings into qualifiers for rows and columns it struck me that in HBaseTimelineWriterImpl.storeInFlowActivityTable We still use GenericObjectMapper to go from a String to a column qualifier. See line 190 (after applying patch): {code} byte[] qualifier = GenericObjectMapper.write(flowRunId); {code} In TimelineEntityReader.parseEntity line 140 we use the regular StringKeyConverter: {code} FlowActivityColumnPrefix.RUN_ID.readResults(result, StringKeyConverter.getInstance()); for (Map.Entrye : runIdsMap.entrySet()) { {code} so presumably the above code should read: {code} byte[] qualifier = StringKeyConverter.getInstance().encode(flowRunId); {code} 2) Somehow encode() and decode() apis sound like they should be symmetric. For StringKeyConverter they are. StringKeyConverter.getInstance.decode(encode(s)).equals.s for any String s (haven't checked null, or ""). Similarly {code} Bytes.equals(b, StringKeyConverter.getInstance.encode(decode(b)) == true {code} for any byte[] b. This isn't true for ApplicationRowKey etc. I understand this is because we use the same encode to create a prefix and I understand that we normally expect all fields to be present when we create an ApplicationRowKey. Somehow it would be nicer to create an applicationRowKey object and then perhaps have a validate method on it, or be able to create one as a prefix (perhaps with a separate constructor). Right now I don't really have a consistent suggestion for this, other than the comment that this somehow seems "unpleasant". If we do decide to keep this current approach (perhaps because alternatives are just as ugly or unpleasant) then we should at least document in the javadoc for the encode and decode methods that the invariant that one might expect from naming doesn't have to hold true for all implementations. We should also update the javadoc in the implementations from a simple @override to a generated javadoc referring to the interface method. I'm not sure if this is Hadoop coding standards, but I've always found it a little silly to methods that implement an interface with @override. For example {code} /* * (non-Javadoc) * * @see * org.apache.hadoop.yarn.server.timelineservice.storage.common.KeyConverter * #encode(java.lang.Object) */ public byte[] encode(EntityRowKey rowKey) { {code} makes more sense to me then {code} @Override public byte[] encode(EntityRowKey rowKey) { {code} but that is probably more a matter of taste... 3) Why don't we do to ColumnHelper.readResultsWithTimestamps what we did to readResults? that would make the method signature: {code} public NavigableMap > readResultsWithTimestamps( Result result, byte[] columnPrefixBytes, KeyConverter keyConverter) throws IOException { {code} With equivalent changes up the call stack in ApplicationColumnPrefix.readResultsWithTimestamps, EntiyColumnPrefix,readResultsWithTimestamps, FlowActivityColumnPrefix.readResultsWithTimestamps, and FlowRunColumnPrefix.readResultsWithTimestamps and all their respective uses. > timestamps are stored unencoded causing parse errors > > > Key: YARN-5109 > URL: https://issues.apache.org/jira/browse/YARN-5109 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Affects Versions: YARN-2928 >Reporter: Sangjin Lee >Assignee: Varun Saxena >Priority: Blocker > Labels: yarn-2928-1st-milestone > Attachments: YARN-5109-YARN-2928.003.patch, > YARN-5109-YARN-2928.01.patch, YARN-5109-YARN-2928.02.patch, > YARN-5109-YARN-2928.03.patch, YARN-5109-YARN-2928.04.patch, > YARN-5109-YARN-2928.05.patch, YARN-5109-YARN-2928.06.patch > > > When we store timestamps (for example as part of the row key or part of the > column name for an event), the bytes are used as is without any encoding. If > the byte value happens to contain a separator character we use (e.g. "!" or > "="), it causes a parse failure when we read it. > I came across this while looking into this error in the timeline reader: > {noformat} > 2016-05-17 21:28:38,643 WARN > org.apache.hadoop.yarn.server.timelineservice.storage.common.TimelineStorageUtils: > incorrectly formatted column name: it will be discarded > {noformat} > I traced the data that was causing this, and the column name (for the event) > was the following:
[jira] [Commented] (YARN-5109) timestamps are stored unencoded causing parse errors
[ https://issues.apache.org/jira/browse/YARN-5109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15299023#comment-15299023 ] Hadoop QA commented on YARN-5109: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 14s {color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s {color} | {color:green} The patch appears to include 5 new or modified test files. {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 22s {color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 57s {color} | {color:green} YARN-2928 passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 30s {color} | {color:green} YARN-2928 passed with JDK v1.8.0_91 {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 39s {color} | {color:green} YARN-2928 passed with JDK v1.7.0_101 {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 28s {color} | {color:green} YARN-2928 passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 47s {color} | {color:green} YARN-2928 passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 33s {color} | {color:green} YARN-2928 passed {color} | | {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 0m 13s {color} | {color:red} branch/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice-hbase-tests no findbugs output file (hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice-hbase-tests/target/findbugsXml.xml) {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 27s {color} | {color:green} YARN-2928 passed with JDK v1.8.0_91 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 31s {color} | {color:green} YARN-2928 passed with JDK v1.7.0_101 {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 11s {color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 38s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 28s {color} | {color:green} the patch passed with JDK v1.8.0_91 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 28s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 34s {color} | {color:green} the patch passed with JDK v1.7.0_101 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 34s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 26s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 42s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 29s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} Patch has no whitespace issues. {color} | | {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 0m 11s {color} | {color:red} patch/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice-hbase-tests no findbugs output file (hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice-hbase-tests/target/findbugsXml.xml) {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 21s {color} | {color:green} the patch passed with JDK v1.8.0_91 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 28s {color} | {color:green} the patch passed with JDK v1.7.0_101 {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 45s {color} | {color:green} hadoop-yarn-server-timelineservice in the patch passed with JDK v1.8.0_91. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 3m 38s {color} | {color:green} hadoop-yarn-server-timelineservice-hbase-tests in the patch passed with JDK v1.8.0_91. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 48s {color} | {color:green}
[jira] [Commented] (YARN-5109) timestamps are stored unencoded causing parse errors
[ https://issues.apache.org/jira/browse/YARN-5109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15299022#comment-15299022 ] Hadoop QA commented on YARN-5109: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 12m 11s {color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s {color} | {color:green} The patch appears to include 5 new or modified test files. {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 2m 28s {color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 8m 48s {color} | {color:green} YARN-2928 passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 20s {color} | {color:green} YARN-2928 passed with JDK v1.8.0_91 {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 31s {color} | {color:green} YARN-2928 passed with JDK v1.7.0_101 {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 31s {color} | {color:green} YARN-2928 passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 48s {color} | {color:green} YARN-2928 passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 38s {color} | {color:green} YARN-2928 passed {color} | | {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 0m 13s {color} | {color:red} branch/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice-hbase-tests no findbugs output file (hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice-hbase-tests/target/findbugsXml.xml) {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 27s {color} | {color:green} YARN-2928 passed with JDK v1.8.0_91 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 31s {color} | {color:green} YARN-2928 passed with JDK v1.7.0_101 {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 11s {color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 37s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 17s {color} | {color:green} the patch passed with JDK v1.8.0_91 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 17s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 30s {color} | {color:green} the patch passed with JDK v1.7.0_101 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 30s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 26s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 41s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 28s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} Patch has no whitespace issues. {color} | | {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 0m 11s {color} | {color:red} patch/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice-hbase-tests no findbugs output file (hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice-hbase-tests/target/findbugsXml.xml) {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 22s {color} | {color:green} the patch passed with JDK v1.8.0_91 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 26s {color} | {color:green} the patch passed with JDK v1.7.0_101 {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 41s {color} | {color:green} hadoop-yarn-server-timelineservice in the patch passed with JDK v1.8.0_91. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 3m 23s {color} | {color:green} hadoop-yarn-server-timelineservice-hbase-tests in the patch passed with JDK v1.8.0_91. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 47s {color} | {color:green}
[jira] [Commented] (YARN-5109) timestamps are stored unencoded causing parse errors
[ https://issues.apache.org/jira/browse/YARN-5109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15298851#comment-15298851 ] Varun Saxena commented on YARN-5109: Makes sense. Will make it private. > timestamps are stored unencoded causing parse errors > > > Key: YARN-5109 > URL: https://issues.apache.org/jira/browse/YARN-5109 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Affects Versions: YARN-2928 >Reporter: Sangjin Lee >Assignee: Varun Saxena >Priority: Blocker > Labels: yarn-2928-1st-milestone > Attachments: YARN-5109-YARN-2928.003.patch, > YARN-5109-YARN-2928.01.patch, YARN-5109-YARN-2928.02.patch, > YARN-5109-YARN-2928.03.patch, YARN-5109-YARN-2928.04.patch, > YARN-5109-YARN-2928.05.patch > > > When we store timestamps (for example as part of the row key or part of the > column name for an event), the bytes are used as is without any encoding. If > the byte value happens to contain a separator character we use (e.g. "!" or > "="), it causes a parse failure when we read it. > I came across this while looking into this error in the timeline reader: > {noformat} > 2016-05-17 21:28:38,643 WARN > org.apache.hadoop.yarn.server.timelineservice.storage.common.TimelineStorageUtils: > incorrectly formatted column name: it will be discarded > {noformat} > I traced the data that was causing this, and the column name (for the event) > was the following: > {noformat} > i:e!YARN_RM_CONTAINER_CREATED=\x7F\xFF\xFE\xABDY=\x99=YARN_CONTAINER_ALLOCATED_HOST > {noformat} > Note that the column name is supposed to be of the format (event > id)=(timestamp)=(event info key). However, observe the timestamp portion: > {noformat} > \x7F\xFF\xFE\xABDY=\x99 > {noformat} > The presence of the separator ("=") causes the parse error. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-5109) timestamps are stored unencoded causing parse errors
[ https://issues.apache.org/jira/browse/YARN-5109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15298433#comment-15298433 ] Sangjin Lee commented on YARN-5109: --- Thanks for updating the patch [~varun_saxena]! I think it's almost there. Just one thing I noticed (which I should have noticed earlier) is that the *static* {{split()}} methods do not really need to be public as they are exclusively used by the instance {{split()}} methods. In fact, it might not be a good idea for them to be used with an arbitrary separator outside the ones we define here. Can we make the static {{split()}} methods all private? > timestamps are stored unencoded causing parse errors > > > Key: YARN-5109 > URL: https://issues.apache.org/jira/browse/YARN-5109 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Affects Versions: YARN-2928 >Reporter: Sangjin Lee >Assignee: Varun Saxena >Priority: Blocker > Labels: yarn-2928-1st-milestone > Attachments: YARN-5109-YARN-2928.003.patch, > YARN-5109-YARN-2928.01.patch, YARN-5109-YARN-2928.02.patch, > YARN-5109-YARN-2928.03.patch, YARN-5109-YARN-2928.04.patch, > YARN-5109-YARN-2928.05.patch > > > When we store timestamps (for example as part of the row key or part of the > column name for an event), the bytes are used as is without any encoding. If > the byte value happens to contain a separator character we use (e.g. "!" or > "="), it causes a parse failure when we read it. > I came across this while looking into this error in the timeline reader: > {noformat} > 2016-05-17 21:28:38,643 WARN > org.apache.hadoop.yarn.server.timelineservice.storage.common.TimelineStorageUtils: > incorrectly formatted column name: it will be discarded > {noformat} > I traced the data that was causing this, and the column name (for the event) > was the following: > {noformat} > i:e!YARN_RM_CONTAINER_CREATED=\x7F\xFF\xFE\xABDY=\x99=YARN_CONTAINER_ALLOCATED_HOST > {noformat} > Note that the column name is supposed to be of the format (event > id)=(timestamp)=(event info key). However, observe the timestamp portion: > {noformat} > \x7F\xFF\xFE\xABDY=\x99 > {noformat} > The presence of the separator ("=") causes the parse error. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-5109) timestamps are stored unencoded causing parse errors
[ https://issues.apache.org/jira/browse/YARN-5109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15298019#comment-15298019 ] Hadoop QA commented on YARN-5109: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 16s {color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s {color} | {color:green} The patch appears to include 5 new or modified test files. {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 24s {color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 20s {color} | {color:green} YARN-2928 passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 20s {color} | {color:green} YARN-2928 passed with JDK v1.8.0_91 {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 31s {color} | {color:green} YARN-2928 passed with JDK v1.7.0_101 {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 28s {color} | {color:green} YARN-2928 passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 47s {color} | {color:green} YARN-2928 passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 32s {color} | {color:green} YARN-2928 passed {color} | | {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 0m 13s {color} | {color:red} branch/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice-hbase-tests no findbugs output file (hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice-hbase-tests/target/findbugsXml.xml) {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 25s {color} | {color:green} YARN-2928 passed with JDK v1.8.0_91 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 30s {color} | {color:green} YARN-2928 passed with JDK v1.7.0_101 {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 11s {color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 37s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 14s {color} | {color:green} the patch passed with JDK v1.8.0_91 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 14s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 30s {color} | {color:green} the patch passed with JDK v1.7.0_101 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 30s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 25s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 41s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 28s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} Patch has no whitespace issues. {color} | | {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 0m 11s {color} | {color:red} patch/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice-hbase-tests no findbugs output file (hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice-hbase-tests/target/findbugsXml.xml) {color} | | {color:red}-1{color} | {color:red} javadoc {color} | {color:red} 0m 13s {color} | {color:red} hadoop-yarn-server-timelineservice in the patch failed with JDK v1.8.0_91. {color} | | {color:red}-1{color} | {color:red} javadoc {color} | {color:red} 2m 5s {color} | {color:red} hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-timelineservice-jdk1.7.0_101 with JDK v1.7.0_101 generated 1 new + 0 unchanged - 0 fixed = 1 total (was 0) {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 26s {color} | {color:green} the patch passed with JDK v1.7.0_101 {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 41s {color} | {color:green} hadoop-yarn-server-timelineservice in the patch passed with JDK v1.8.0_91.
[jira] [Commented] (YARN-5109) timestamps are stored unencoded causing parse errors
[ https://issues.apache.org/jira/browse/YARN-5109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15297969#comment-15297969 ] Hadoop QA commented on YARN-5109: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s {color} | {color:blue} Docker mode activated. {color} | | {color:red}-1{color} | {color:red} docker {color} | {color:red} 6m 13s {color} | {color:red} Docker failed to build yetus/hadoop:cf2ee45. {color} | \\ \\ || Subsystem || Report/Notes || | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12805841/YARN-5109-YARN-2928.04.patch | | JIRA Issue | YARN-5109 | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/11649/console | | Powered by | Apache Yetus 0.2.0 http://yetus.apache.org | This message was automatically generated. > timestamps are stored unencoded causing parse errors > > > Key: YARN-5109 > URL: https://issues.apache.org/jira/browse/YARN-5109 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Affects Versions: YARN-2928 >Reporter: Sangjin Lee >Assignee: Varun Saxena >Priority: Blocker > Labels: yarn-2928-1st-milestone > Attachments: YARN-5109-YARN-2928.003.patch, > YARN-5109-YARN-2928.01.patch, YARN-5109-YARN-2928.02.patch, > YARN-5109-YARN-2928.03.patch, YARN-5109-YARN-2928.04.patch > > > When we store timestamps (for example as part of the row key or part of the > column name for an event), the bytes are used as is without any encoding. If > the byte value happens to contain a separator character we use (e.g. "!" or > "="), it causes a parse failure when we read it. > I came across this while looking into this error in the timeline reader: > {noformat} > 2016-05-17 21:28:38,643 WARN > org.apache.hadoop.yarn.server.timelineservice.storage.common.TimelineStorageUtils: > incorrectly formatted column name: it will be discarded > {noformat} > I traced the data that was causing this, and the column name (for the event) > was the following: > {noformat} > i:e!YARN_RM_CONTAINER_CREATED=\x7F\xFF\xFE\xABDY=\x99=YARN_CONTAINER_ALLOCATED_HOST > {noformat} > Note that the column name is supposed to be of the format (event > id)=(timestamp)=(event info key). However, observe the timestamp portion: > {noformat} > \x7F\xFF\xFE\xABDY=\x99 > {noformat} > The presence of the separator ("=") causes the parse error. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-5109) timestamps are stored unencoded causing parse errors
[ https://issues.apache.org/jira/browse/YARN-5109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15297844#comment-15297844 ] Varun Saxena commented on YARN-5109: bq. Also, do we have a test that tests an encoded long having a separator in it? After all, that's what caused us to uncover this issue. Yes, we have. In TestKeyConverters, I am trying to create flow run id and cluster timestamp(in app id) in a manner that will have separators in it. Event column name issue is also simulated. Infact it takes care of the case if QUALIFIER changes in future as well. TestHBaseTimelineStorage#testEventsEscapeTs takes care of issue with event column name in an E2E test case. bq. Should we replace "" with Separator.EMPTY_BYTES? That should be equivalent, right? As such, its not completely equal. We are calling joinEncoded, which takes strings. If we call join, we will have to first encode the string. I anyways added a constant EMPTY_STRING in Separator and using it. bq. I think NO_LIMIT_SPLIT and VARIABLE_SIZE are getting confusing. Since we're using VARIABLE_SIZE for the most part, can we remove NO_LIMIT_SPLIT NO_LIMIT_SPLIT is meant for indicating there is no limit to number of splits returned. VARIABLE_SIZE is used to indicate that size of a segment in split is variable. Anyways we can say VARIABLE_SIZE means not a fixed number of splits as well. Other issues have been fixed. > timestamps are stored unencoded causing parse errors > > > Key: YARN-5109 > URL: https://issues.apache.org/jira/browse/YARN-5109 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Affects Versions: YARN-2928 >Reporter: Sangjin Lee >Assignee: Varun Saxena >Priority: Blocker > Labels: yarn-2928-1st-milestone > Attachments: YARN-5109-YARN-2928.003.patch, > YARN-5109-YARN-2928.01.patch, YARN-5109-YARN-2928.02.patch, > YARN-5109-YARN-2928.03.patch > > > When we store timestamps (for example as part of the row key or part of the > column name for an event), the bytes are used as is without any encoding. If > the byte value happens to contain a separator character we use (e.g. "!" or > "="), it causes a parse failure when we read it. > I came across this while looking into this error in the timeline reader: > {noformat} > 2016-05-17 21:28:38,643 WARN > org.apache.hadoop.yarn.server.timelineservice.storage.common.TimelineStorageUtils: > incorrectly formatted column name: it will be discarded > {noformat} > I traced the data that was causing this, and the column name (for the event) > was the following: > {noformat} > i:e!YARN_RM_CONTAINER_CREATED=\x7F\xFF\xFE\xABDY=\x99=YARN_CONTAINER_ALLOCATED_HOST > {noformat} > Note that the column name is supposed to be of the format (event > id)=(timestamp)=(event info key). However, observe the timestamp portion: > {noformat} > \x7F\xFF\xFE\xABDY=\x99 > {noformat} > The presence of the separator ("=") causes the parse error. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-5109) timestamps are stored unencoded causing parse errors
[ https://issues.apache.org/jira/browse/YARN-5109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15297099#comment-15297099 ] Sangjin Lee commented on YARN-5109: --- Thanks [~varun_saxena] for the patch! I think it's almost there. I have a few mostly minor comments. It would be great if you could address them. In terms of the package placement of the key converter classes, should they be in the respective packages instead of all in common? For example, {{ApplicationRowKeyConverter}} is really used by the application table classes, so it would be more natural to have it in the application package, and so on. Thoughts? Also, do we have a test that tests an encoded long having a separator in it? After all, that's what caused us to uncover this issue. :) (AppIdKeyConverter.java) - one small suggestion: how about adding a method {{getKeySize()}} to return the expected size of the key so that users of {{AppIdKeyConverter}} do not need to hard-code the size themselves? (FlowActivityRowKeyConverter.java) - l.55: Should we replace "" with {{Separator.EMPTY_BYTES}}? That should be equivalent, right? (FlowRunRowKeyConverter.java) - l.56: same as above (EventColumnName.java) - l.31: the {{super()}} call is superfluous; can we remove it? (ColumnHelper.java) - l.259: nit: typo ({{converteColumnKey}} -> {{converterColumnKey}} or?) (Separator.java) - l.71: I think {{NO_LIMIT_SPLIT}} and {{VARIABLE_SIZE}} are getting confusing. Since we're using {{VARIABLE_SIZE}} for the most part, can we remove {{NO_LIMIT_SPLIT}}? - l.491: I think this now calls {{split(byte[], byte[], int[])}}, not {{split(byte[], byte[], int)}}, which is quite confusing. We should eliminate the ambiguity here, by explicitly calling with the right argument instead of just {{null}}. - we should make {{splitRanges()}} private or package-private > timestamps are stored unencoded causing parse errors > > > Key: YARN-5109 > URL: https://issues.apache.org/jira/browse/YARN-5109 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Affects Versions: YARN-2928 >Reporter: Sangjin Lee >Assignee: Varun Saxena >Priority: Blocker > Labels: yarn-2928-1st-milestone > Attachments: YARN-5109-YARN-2928.003.patch, > YARN-5109-YARN-2928.01.patch, YARN-5109-YARN-2928.02.patch, > YARN-5109-YARN-2928.03.patch > > > When we store timestamps (for example as part of the row key or part of the > column name for an event), the bytes are used as is without any encoding. If > the byte value happens to contain a separator character we use (e.g. "!" or > "="), it causes a parse failure when we read it. > I came across this while looking into this error in the timeline reader: > {noformat} > 2016-05-17 21:28:38,643 WARN > org.apache.hadoop.yarn.server.timelineservice.storage.common.TimelineStorageUtils: > incorrectly formatted column name: it will be discarded > {noformat} > I traced the data that was causing this, and the column name (for the event) > was the following: > {noformat} > i:e!YARN_RM_CONTAINER_CREATED=\x7F\xFF\xFE\xABDY=\x99=YARN_CONTAINER_ALLOCATED_HOST > {noformat} > Note that the column name is supposed to be of the format (event > id)=(timestamp)=(event info key). However, observe the timestamp portion: > {noformat} > \x7F\xFF\xFE\xABDY=\x99 > {noformat} > The presence of the separator ("=") causes the parse error. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-5109) timestamps are stored unencoded causing parse errors
[ https://issues.apache.org/jira/browse/YARN-5109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15295281#comment-15295281 ] Hadoop QA commented on YARN-5109: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 14s {color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s {color} | {color:green} The patch appears to include 5 new or modified test files. {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 25s {color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 5s {color} | {color:green} YARN-2928 passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 15s {color} | {color:green} YARN-2928 passed with JDK v1.8.0_91 {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 31s {color} | {color:green} YARN-2928 passed with JDK v1.7.0_101 {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 28s {color} | {color:green} YARN-2928 passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 46s {color} | {color:green} YARN-2928 passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 33s {color} | {color:green} YARN-2928 passed {color} | | {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 0m 13s {color} | {color:red} branch/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice-hbase-tests no findbugs output file (hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice-hbase-tests/target/findbugsXml.xml) {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 26s {color} | {color:green} YARN-2928 passed with JDK v1.8.0_91 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 31s {color} | {color:green} YARN-2928 passed with JDK v1.7.0_101 {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 12s {color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 36s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 12s {color} | {color:green} the patch passed with JDK v1.8.0_91 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 12s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 30s {color} | {color:green} the patch passed with JDK v1.7.0_101 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 30s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 25s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 42s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 28s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} Patch has no whitespace issues. {color} | | {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 0m 11s {color} | {color:red} patch/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice-hbase-tests no findbugs output file (hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice-hbase-tests/target/findbugsXml.xml) {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 21s {color} | {color:green} the patch passed with JDK v1.8.0_91 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 27s {color} | {color:green} the patch passed with JDK v1.7.0_101 {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 39s {color} | {color:green} hadoop-yarn-server-timelineservice in the patch passed with JDK v1.8.0_91. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 4m 0s {color} | {color:green} hadoop-yarn-server-timelineservice-hbase-tests in the patch passed with JDK v1.8.0_91. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 47s {color} | {color:green}
[jira] [Commented] (YARN-5109) timestamps are stored unencoded causing parse errors
[ https://issues.apache.org/jira/browse/YARN-5109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15295257#comment-15295257 ] Hadoop QA commented on YARN-5109: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 14s {color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s {color} | {color:green} The patch appears to include 5 new or modified test files. {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 2m 39s {color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 8m 32s {color} | {color:green} YARN-2928 passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 18s {color} | {color:green} YARN-2928 passed with JDK v1.8.0_91 {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 32s {color} | {color:green} YARN-2928 passed with JDK v1.7.0_101 {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 33s {color} | {color:green} YARN-2928 passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 49s {color} | {color:green} YARN-2928 passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 38s {color} | {color:green} YARN-2928 passed {color} | | {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 0m 13s {color} | {color:red} branch/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice-hbase-tests no findbugs output file (hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice-hbase-tests/target/findbugsXml.xml) {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 29s {color} | {color:green} YARN-2928 passed with JDK v1.8.0_91 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 31s {color} | {color:green} YARN-2928 passed with JDK v1.7.0_101 {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 11s {color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 36s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 13s {color} | {color:green} the patch passed with JDK v1.8.0_91 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 13s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 30s {color} | {color:green} the patch passed with JDK v1.7.0_101 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 30s {color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 25s {color} | {color:red} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server: patch generated 8 new + 2 unchanged - 0 fixed = 10 total (was 2) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 42s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 28s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} Patch has no whitespace issues. {color} | | {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 0m 12s {color} | {color:red} patch/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice-hbase-tests no findbugs output file (hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice-hbase-tests/target/findbugsXml.xml) {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 22s {color} | {color:green} the patch passed with JDK v1.8.0_91 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 26s {color} | {color:green} the patch passed with JDK v1.7.0_101 {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 40s {color} | {color:green} hadoop-yarn-server-timelineservice in the patch passed with JDK v1.8.0_91. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 4m 1s {color} | {color:green} hadoop-yarn-server-timelineservice-hbase-tests in the patch passed with JDK v1.8.0_91. {color} | |
[jira] [Commented] (YARN-5109) timestamps are stored unencoded causing parse errors
[ https://issues.apache.org/jira/browse/YARN-5109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15295242#comment-15295242 ] Varun Saxena commented on YARN-5109: Consistently getting build failures on Jenkins with following error {noformat} ERROR: a previous rebase failed. Aborting it. HEAD is now at d491ef0 YARN-3367. Replace starting a separate thread for post entity with event loop in TimelineClient (Naganarasimha G R via sjlee) Switched to branch 'trunk' Your branch is up-to-date with 'origin/trunk'. Current branch trunk is up to date. Switched to branch 'YARN-2928' Your branch and 'origin/YARN-2928' have diverged, and have 85 and 782 different commits each, respectively. (use "git pull" to merge the remote branch into yours) First, rewinding head to replay your work on top of it... Applying: YARN-3063. Bootstrapping TimelineServer next generation module. Contributed by Zhijie Shen. Using index info to reconstruct a base tree... M hadoop-project/pom.xml A hadoop-yarn-project/CHANGES.txt M hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/pom.xml :47: trailing whitespace. Trunk - Unreleased warning: 1 line adds whitespace errors. Falling back to patching base and 3-way merge... Auto-merging hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/pom.xml CONFLICT (content): Merge conflict in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/pom.xml Auto-merging hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice/pom.xml CONFLICT (add/add): Merge conflict in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice/pom.xml CONFLICT (modify/delete): hadoop-yarn-project/CHANGES.txt deleted in HEAD and modified in YARN-3063. Bootstrapping TimelineServer next generation module. Contributed by Zhijie Shen.. Version YARN-3063. Bootstrapping TimelineServer next generation module. Contributed by Zhijie Shen. of hadoop-yarn-project/CHANGES.txt left in tree. Auto-merging hadoop-project/pom.xml CONFLICT (content): Merge conflict in hadoop-project/pom.xml Failed to merge in the changes. Patch failed at 0001 YARN-3063. Bootstrapping TimelineServer next generation module. Contributed by Zhijie Shen. The copy of the patch that failed is found in: /home/jenkins/jenkins-slave/workspace/PreCommit-YARN-Build/.git/rebase-apply/patch When you have resolved this problem, run "git rebase --continue". If you prefer to skip this patch, run "git rebase --skip" instead. To check out the original branch and stop rebasing, run "git rebase --abort". ERROR: git pull is failing {noformat} Submitted the JIRA manually twice, together so that another build can start before first submission fails. Then it works. It seems workspace on one of the machines has a problem. > timestamps are stored unencoded causing parse errors > > > Key: YARN-5109 > URL: https://issues.apache.org/jira/browse/YARN-5109 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Affects Versions: YARN-2928 >Reporter: Sangjin Lee >Assignee: Varun Saxena >Priority: Blocker > Labels: yarn-2928-1st-milestone > Attachments: YARN-5109-YARN-2928.003.patch, > YARN-5109-YARN-2928.01.patch, YARN-5109-YARN-2928.02.patch, > YARN-5109-YARN-2928.03.patch > > > When we store timestamps (for example as part of the row key or part of the > column name for an event), the bytes are used as is without any encoding. If > the byte value happens to contain a separator character we use (e.g. "!" or > "="), it causes a parse failure when we read it. > I came across this while looking into this error in the timeline reader: > {noformat} > 2016-05-17 21:28:38,643 WARN > org.apache.hadoop.yarn.server.timelineservice.storage.common.TimelineStorageUtils: > incorrectly formatted column name: it will be discarded > {noformat} > I traced the data that was causing this, and the column name (for the event) > was the following: > {noformat} > i:e!YARN_RM_CONTAINER_CREATED=\x7F\xFF\xFE\xABDY=\x99=YARN_CONTAINER_ALLOCATED_HOST > {noformat} > Note that the column name is supposed to be of the format (event > id)=(timestamp)=(event info key). However, observe the timestamp portion: > {noformat} > \x7F\xFF\xFE\xABDY=\x99 > {noformat} > The presence of the separator ("=") causes the parse error. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-5109) timestamps are stored unencoded causing parse errors
[ https://issues.apache.org/jira/browse/YARN-5109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15295203#comment-15295203 ] Varun Saxena commented on YARN-5109: Updating a patch trying to fix checkstyle issues. > timestamps are stored unencoded causing parse errors > > > Key: YARN-5109 > URL: https://issues.apache.org/jira/browse/YARN-5109 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Affects Versions: YARN-2928 >Reporter: Sangjin Lee >Assignee: Varun Saxena >Priority: Blocker > Labels: yarn-2928-1st-milestone > Attachments: YARN-5109-YARN-2928.003.patch, > YARN-5109-YARN-2928.01.patch, YARN-5109-YARN-2928.02.patch, > YARN-5109-YARN-2928.03.patch > > > When we store timestamps (for example as part of the row key or part of the > column name for an event), the bytes are used as is without any encoding. If > the byte value happens to contain a separator character we use (e.g. "!" or > "="), it causes a parse failure when we read it. > I came across this while looking into this error in the timeline reader: > {noformat} > 2016-05-17 21:28:38,643 WARN > org.apache.hadoop.yarn.server.timelineservice.storage.common.TimelineStorageUtils: > incorrectly formatted column name: it will be discarded > {noformat} > I traced the data that was causing this, and the column name (for the event) > was the following: > {noformat} > i:e!YARN_RM_CONTAINER_CREATED=\x7F\xFF\xFE\xABDY=\x99=YARN_CONTAINER_ALLOCATED_HOST > {noformat} > Note that the column name is supposed to be of the format (event > id)=(timestamp)=(event info key). However, observe the timestamp portion: > {noformat} > \x7F\xFF\xFE\xABDY=\x99 > {noformat} > The presence of the separator ("=") causes the parse error. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-5109) timestamps are stored unencoded causing parse errors
[ https://issues.apache.org/jira/browse/YARN-5109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15295159#comment-15295159 ] Hadoop QA commented on YARN-5109: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 18s {color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s {color} | {color:green} The patch appears to include 5 new or modified test files. {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 57s {color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 12s {color} | {color:green} YARN-2928 passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 22s {color} | {color:green} YARN-2928 passed with JDK v1.8.0_91 {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 32s {color} | {color:green} YARN-2928 passed with JDK v1.7.0_101 {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 29s {color} | {color:green} YARN-2928 passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 46s {color} | {color:green} YARN-2928 passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 33s {color} | {color:green} YARN-2928 passed {color} | | {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 0m 13s {color} | {color:red} branch/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice-hbase-tests no findbugs output file (hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice-hbase-tests/target/findbugsXml.xml) {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 25s {color} | {color:green} YARN-2928 passed with JDK v1.8.0_91 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 31s {color} | {color:green} YARN-2928 passed with JDK v1.7.0_101 {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 11s {color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 37s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 18s {color} | {color:green} the patch passed with JDK v1.8.0_91 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 18s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 29s {color} | {color:green} the patch passed with JDK v1.7.0_101 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 29s {color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 27s {color} | {color:red} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server: patch generated 7 new + 2 unchanged - 0 fixed = 9 total (was 2) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 41s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 28s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} Patch has no whitespace issues. {color} | | {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 0m 11s {color} | {color:red} patch/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice-hbase-tests no findbugs output file (hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice-hbase-tests/target/findbugsXml.xml) {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 21s {color} | {color:green} the patch passed with JDK v1.8.0_91 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 26s {color} | {color:green} the patch passed with JDK v1.7.0_101 {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 41s {color} | {color:green} hadoop-yarn-server-timelineservice in the patch passed with JDK v1.8.0_91. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 3m 27s {color} | {color:green} hadoop-yarn-server-timelineservice-hbase-tests in the patch passed with JDK v1.8.0_91. {color} | |
[jira] [Commented] (YARN-5109) timestamps are stored unencoded causing parse errors
[ https://issues.apache.org/jira/browse/YARN-5109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15294610#comment-15294610 ] Joep Rottinghuis commented on YARN-5109: Seems sensible. Looking forward to see in context on patch > timestamps are stored unencoded causing parse errors > > > Key: YARN-5109 > URL: https://issues.apache.org/jira/browse/YARN-5109 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Affects Versions: YARN-2928 >Reporter: Sangjin Lee >Assignee: Varun Saxena >Priority: Blocker > Labels: yarn-2928-1st-milestone > Attachments: YARN-5109-YARN-2928.01.patch, > YARN-5109-YARN-2928.02.patch > > > When we store timestamps (for example as part of the row key or part of the > column name for an event), the bytes are used as is without any encoding. If > the byte value happens to contain a separator character we use (e.g. "!" or > "="), it causes a parse failure when we read it. > I came across this while looking into this error in the timeline reader: > {noformat} > 2016-05-17 21:28:38,643 WARN > org.apache.hadoop.yarn.server.timelineservice.storage.common.TimelineStorageUtils: > incorrectly formatted column name: it will be discarded > {noformat} > I traced the data that was causing this, and the column name (for the event) > was the following: > {noformat} > i:e!YARN_RM_CONTAINER_CREATED=\x7F\xFF\xFE\xABDY=\x99=YARN_CONTAINER_ALLOCATED_HOST > {noformat} > Note that the column name is supposed to be of the format (event > id)=(timestamp)=(event info key). However, observe the timestamp portion: > {noformat} > \x7F\xFF\xFE\xABDY=\x99 > {noformat} > The presence of the separator ("=") causes the parse error. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-5109) timestamps are stored unencoded causing parse errors
[ https://issues.apache.org/jira/browse/YARN-5109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15294603#comment-15294603 ] Varun Saxena commented on YARN-5109: [~sjlee0], yeah this doesnt break anything. I was just curious to know why the different approaches. Above makes sense. > timestamps are stored unencoded causing parse errors > > > Key: YARN-5109 > URL: https://issues.apache.org/jira/browse/YARN-5109 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Affects Versions: YARN-2928 >Reporter: Sangjin Lee >Assignee: Varun Saxena >Priority: Blocker > Labels: yarn-2928-1st-milestone > Attachments: YARN-5109-YARN-2928.01.patch, > YARN-5109-YARN-2928.02.patch > > > When we store timestamps (for example as part of the row key or part of the > column name for an event), the bytes are used as is without any encoding. If > the byte value happens to contain a separator character we use (e.g. "!" or > "="), it causes a parse failure when we read it. > I came across this while looking into this error in the timeline reader: > {noformat} > 2016-05-17 21:28:38,643 WARN > org.apache.hadoop.yarn.server.timelineservice.storage.common.TimelineStorageUtils: > incorrectly formatted column name: it will be discarded > {noformat} > I traced the data that was causing this, and the column name (for the event) > was the following: > {noformat} > i:e!YARN_RM_CONTAINER_CREATED=\x7F\xFF\xFE\xABDY=\x99=YARN_CONTAINER_ALLOCATED_HOST > {noformat} > Note that the column name is supposed to be of the format (event > id)=(timestamp)=(event info key). However, observe the timestamp portion: > {noformat} > \x7F\xFF\xFE\xABDY=\x99 > {noformat} > The presence of the separator ("=") causes the parse error. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-5109) timestamps are stored unencoded causing parse errors
[ https://issues.apache.org/jira/browse/YARN-5109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15294601#comment-15294601 ] Varun Saxena commented on YARN-5109: [~sjlee0], Yes, type safety will have to be ensured within this encode. The specific function I am talking about is {{TimelineFilterUtils#createFiltersFromColumnQualifiers}}. This is used for events and relations. Event filters and relation filters cannot be applied using HBase SingleColumnValueFilter so we fetch all the columns specified which are there in event filters and relation filters(i.e. events in event filters or entity types in relation filters). I mentioned about {{Object... params}} as that is what came to my mind just before signing off for the day. But on second thoughts, I think we can have a switch case based on column prefix and construct EventColumnName from there. We will have only 2 switch cases here other than default(i.e. ApplicationColumnPrefix.EVENT and EntityColumnPrefix.EVENT). The number of cases should not become humongous in this switch case even from a long term perspective. And if it does, we can revisit on a solution then. I will go with this approach now. > timestamps are stored unencoded causing parse errors > > > Key: YARN-5109 > URL: https://issues.apache.org/jira/browse/YARN-5109 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Affects Versions: YARN-2928 >Reporter: Sangjin Lee >Assignee: Varun Saxena >Priority: Blocker > Labels: yarn-2928-1st-milestone > Attachments: YARN-5109-YARN-2928.01.patch, > YARN-5109-YARN-2928.02.patch > > > When we store timestamps (for example as part of the row key or part of the > column name for an event), the bytes are used as is without any encoding. If > the byte value happens to contain a separator character we use (e.g. "!" or > "="), it causes a parse failure when we read it. > I came across this while looking into this error in the timeline reader: > {noformat} > 2016-05-17 21:28:38,643 WARN > org.apache.hadoop.yarn.server.timelineservice.storage.common.TimelineStorageUtils: > incorrectly formatted column name: it will be discarded > {noformat} > I traced the data that was causing this, and the column name (for the event) > was the following: > {noformat} > i:e!YARN_RM_CONTAINER_CREATED=\x7F\xFF\xFE\xABDY=\x99=YARN_CONTAINER_ALLOCATED_HOST > {noformat} > Note that the column name is supposed to be of the format (event > id)=(timestamp)=(event info key). However, observe the timestamp portion: > {noformat} > \x7F\xFF\xFE\xABDY=\x99 > {noformat} > The presence of the separator ("=") causes the parse error. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-5109) timestamps are stored unencoded causing parse errors
[ https://issues.apache.org/jira/browse/YARN-5109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15294600#comment-15294600 ] Joep Rottinghuis commented on YARN-5109: Agreed with [~sjlee0] that we'd like to avoid non-type safe conversions. Would like to see where exactly the challenge lies indeed. Perhaps the filters can be parameterized. Hard to say without understanding the exact use-case. I'm sure this ends up as a non-trivial refactor considering the various cases, prefix or not, compound column keys, or strings, rowkeys, and on top of that filters... > timestamps are stored unencoded causing parse errors > > > Key: YARN-5109 > URL: https://issues.apache.org/jira/browse/YARN-5109 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Affects Versions: YARN-2928 >Reporter: Sangjin Lee >Assignee: Varun Saxena >Priority: Blocker > Labels: yarn-2928-1st-milestone > Attachments: YARN-5109-YARN-2928.01.patch, > YARN-5109-YARN-2928.02.patch > > > When we store timestamps (for example as part of the row key or part of the > column name for an event), the bytes are used as is without any encoding. If > the byte value happens to contain a separator character we use (e.g. "!" or > "="), it causes a parse failure when we read it. > I came across this while looking into this error in the timeline reader: > {noformat} > 2016-05-17 21:28:38,643 WARN > org.apache.hadoop.yarn.server.timelineservice.storage.common.TimelineStorageUtils: > incorrectly formatted column name: it will be discarded > {noformat} > I traced the data that was causing this, and the column name (for the event) > was the following: > {noformat} > i:e!YARN_RM_CONTAINER_CREATED=\x7F\xFF\xFE\xABDY=\x99=YARN_CONTAINER_ALLOCATED_HOST > {noformat} > Note that the column name is supposed to be of the format (event > id)=(timestamp)=(event info key). However, observe the timestamp portion: > {noformat} > \x7F\xFF\xFE\xABDY=\x99 > {noformat} > The presence of the separator ("=") causes the parse error. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-5109) timestamps are stored unencoded causing parse errors
[ https://issues.apache.org/jira/browse/YARN-5109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15294595#comment-15294595 ] Joep Rottinghuis commented on YARN-5109: bq. The main motivation for reversing the user and the cluster in the entity table is to accommodate the fact that the table can get real large and we wanted to provide good partitioning by using the user dimension rather than the cluster dimension. We preserved the original order (cluster and then user) for the application table. Indeed. Aside from sheer size, which would be roughly equal in both cases, we especially want to avoid hotspotting during writes with a very high update volume. Even if we can keep the total size under control with an expiration policy, fact remains that if there is a specifically large cluster (and/or just one cluster) the cluster prefix doesn't really help spread the load. If the user is first, we at least "salt" the key with the user, so the load gets spread across the various users. Of course the same issue could happen the other way around, if somebody runs many clusters and all of them they run jobs emitting entities (with metric time series data) as one single user (let's say "hadoop"), we'd still hotspot. The volume for the application table _should_ be smaller. In a multi-cluster setup, the load would also spread there. Range scans per cluster would be more efficient over the application table, at the cost of reduced parallelism during writes. bq. The bottom line is that since this was the intended design and nothing is broken, we should not revisit it as part of this JIRA. Let me know if that is OK with you guys. +1 agreed. > timestamps are stored unencoded causing parse errors > > > Key: YARN-5109 > URL: https://issues.apache.org/jira/browse/YARN-5109 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Affects Versions: YARN-2928 >Reporter: Sangjin Lee >Assignee: Varun Saxena >Priority: Blocker > Labels: yarn-2928-1st-milestone > Attachments: YARN-5109-YARN-2928.01.patch, > YARN-5109-YARN-2928.02.patch > > > When we store timestamps (for example as part of the row key or part of the > column name for an event), the bytes are used as is without any encoding. If > the byte value happens to contain a separator character we use (e.g. "!" or > "="), it causes a parse failure when we read it. > I came across this while looking into this error in the timeline reader: > {noformat} > 2016-05-17 21:28:38,643 WARN > org.apache.hadoop.yarn.server.timelineservice.storage.common.TimelineStorageUtils: > incorrectly formatted column name: it will be discarded > {noformat} > I traced the data that was causing this, and the column name (for the event) > was the following: > {noformat} > i:e!YARN_RM_CONTAINER_CREATED=\x7F\xFF\xFE\xABDY=\x99=YARN_CONTAINER_ALLOCATED_HOST > {noformat} > Note that the column name is supposed to be of the format (event > id)=(timestamp)=(event info key). However, observe the timestamp portion: > {noformat} > \x7F\xFF\xFE\xABDY=\x99 > {noformat} > The presence of the separator ("=") causes the parse error. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-5109) timestamps are stored unencoded causing parse errors
[ https://issues.apache.org/jira/browse/YARN-5109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15294559#comment-15294559 ] Sangjin Lee commented on YARN-5109: --- Hmm, could you point to the specific method where the proposed {{byte[] encode(T key)}} would not work and we would need one that takes an {{Object}} array? That's bit worrisome as {{Object... params}} is not really type-safe, and it can be error-prone. > timestamps are stored unencoded causing parse errors > > > Key: YARN-5109 > URL: https://issues.apache.org/jira/browse/YARN-5109 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Affects Versions: YARN-2928 >Reporter: Sangjin Lee >Assignee: Varun Saxena >Priority: Blocker > Labels: yarn-2928-1st-milestone > Attachments: YARN-5109-YARN-2928.01.patch, > YARN-5109-YARN-2928.02.patch > > > When we store timestamps (for example as part of the row key or part of the > column name for an event), the bytes are used as is without any encoding. If > the byte value happens to contain a separator character we use (e.g. "!" or > "="), it causes a parse failure when we read it. > I came across this while looking into this error in the timeline reader: > {noformat} > 2016-05-17 21:28:38,643 WARN > org.apache.hadoop.yarn.server.timelineservice.storage.common.TimelineStorageUtils: > incorrectly formatted column name: it will be discarded > {noformat} > I traced the data that was causing this, and the column name (for the event) > was the following: > {noformat} > i:e!YARN_RM_CONTAINER_CREATED=\x7F\xFF\xFE\xABDY=\x99=YARN_CONTAINER_ALLOCATED_HOST > {noformat} > Note that the column name is supposed to be of the format (event > id)=(timestamp)=(event info key). However, observe the timestamp portion: > {noformat} > \x7F\xFF\xFE\xABDY=\x99 > {noformat} > The presence of the separator ("=") causes the parse error. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-5109) timestamps are stored unencoded causing parse errors
[ https://issues.apache.org/jira/browse/YARN-5109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15294555#comment-15294555 ] Sangjin Lee commented on YARN-5109: --- [~jrottinghuis], [~vrushalic], and I dug a little bit, and it appears to be intentional. See YARN-3906 and YARN-3815 (see [the attached doc|https://issues.apache.org/jira/secure/attachment/12743391/hbase-schema-proposal-for-aggregation.pdf]). The main motivation for reversing the user and the cluster in the entity table is to accommodate the fact that the table can get real large and we wanted to provide good partitioning by using the user dimension rather than the cluster dimension. We preserved the original order (cluster and then user) for the application table. The bottom line is that since this was the intended design and nothing is broken, we should not revisit it as part of this JIRA. Let me know if that is OK with you guys. > timestamps are stored unencoded causing parse errors > > > Key: YARN-5109 > URL: https://issues.apache.org/jira/browse/YARN-5109 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Affects Versions: YARN-2928 >Reporter: Sangjin Lee >Assignee: Varun Saxena >Priority: Blocker > Labels: yarn-2928-1st-milestone > Attachments: YARN-5109-YARN-2928.01.patch, > YARN-5109-YARN-2928.02.patch > > > When we store timestamps (for example as part of the row key or part of the > column name for an event), the bytes are used as is without any encoding. If > the byte value happens to contain a separator character we use (e.g. "!" or > "="), it causes a parse failure when we read it. > I came across this while looking into this error in the timeline reader: > {noformat} > 2016-05-17 21:28:38,643 WARN > org.apache.hadoop.yarn.server.timelineservice.storage.common.TimelineStorageUtils: > incorrectly formatted column name: it will be discarded > {noformat} > I traced the data that was causing this, and the column name (for the event) > was the following: > {noformat} > i:e!YARN_RM_CONTAINER_CREATED=\x7F\xFF\xFE\xABDY=\x99=YARN_CONTAINER_ALLOCATED_HOST > {noformat} > Note that the column name is supposed to be of the format (event > id)=(timestamp)=(event info key). However, observe the timestamp portion: > {noformat} > \x7F\xFF\xFE\xABDY=\x99 > {noformat} > The presence of the separator ("=") causes the parse error. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-5109) timestamps are stored unencoded causing parse errors
[ https://issues.apache.org/jira/browse/YARN-5109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15294277#comment-15294277 ] Varun Saxena commented on YARN-5109: Infact even getColumnPrefixBytes would not work for the code in TimelineFilterUtils as we would not know which class the converter will take(say, EventColumnName or String). So probably we would need another method in converter which takes multiple Objects as parameters and interprets them in sequence and either encodes them or returns a key. Something like below. Then we can either pass a encoded byte array or the Object which needs to be encoded to ColumnPrefix to attach a prefix in front of the qualifier. {code} public interface KeyConverter { byte[] encode(Object... params); OR T createKey(Object...params); byte[] encode(T key); T decode(byte[] bytes); } {code} Will look at it tomorrow. > timestamps are stored unencoded causing parse errors > > > Key: YARN-5109 > URL: https://issues.apache.org/jira/browse/YARN-5109 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Affects Versions: YARN-2928 >Reporter: Sangjin Lee >Assignee: Varun Saxena >Priority: Blocker > Labels: yarn-2928-1st-milestone > Attachments: YARN-5109-YARN-2928.01.patch, > YARN-5109-YARN-2928.02.patch > > > When we store timestamps (for example as part of the row key or part of the > column name for an event), the bytes are used as is without any encoding. If > the byte value happens to contain a separator character we use (e.g. "!" or > "="), it causes a parse failure when we read it. > I came across this while looking into this error in the timeline reader: > {noformat} > 2016-05-17 21:28:38,643 WARN > org.apache.hadoop.yarn.server.timelineservice.storage.common.TimelineStorageUtils: > incorrectly formatted column name: it will be discarded > {noformat} > I traced the data that was causing this, and the column name (for the event) > was the following: > {noformat} > i:e!YARN_RM_CONTAINER_CREATED=\x7F\xFF\xFE\xABDY=\x99=YARN_CONTAINER_ALLOCATED_HOST > {noformat} > Note that the column name is supposed to be of the format (event > id)=(timestamp)=(event info key). However, observe the timestamp portion: > {noformat} > \x7F\xFF\xFE\xABDY=\x99 > {noformat} > The presence of the separator ("=") causes the parse error. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-5109) timestamps are stored unencoded causing parse errors
[ https://issues.apache.org/jira/browse/YARN-5109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15294198#comment-15294198 ] Varun Saxena commented on YARN-5109: Just to update, the patch is almost complete. Will have it up by > timestamps are stored unencoded causing parse errors > > > Key: YARN-5109 > URL: https://issues.apache.org/jira/browse/YARN-5109 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Affects Versions: YARN-2928 >Reporter: Sangjin Lee >Assignee: Varun Saxena >Priority: Blocker > Labels: yarn-2928-1st-milestone > Attachments: YARN-5109-YARN-2928.01.patch, > YARN-5109-YARN-2928.02.patch > > > When we store timestamps (for example as part of the row key or part of the > column name for an event), the bytes are used as is without any encoding. If > the byte value happens to contain a separator character we use (e.g. "!" or > "="), it causes a parse failure when we read it. > I came across this while looking into this error in the timeline reader: > {noformat} > 2016-05-17 21:28:38,643 WARN > org.apache.hadoop.yarn.server.timelineservice.storage.common.TimelineStorageUtils: > incorrectly formatted column name: it will be discarded > {noformat} > I traced the data that was causing this, and the column name (for the event) > was the following: > {noformat} > i:e!YARN_RM_CONTAINER_CREATED=\x7F\xFF\xFE\xABDY=\x99=YARN_CONTAINER_ALLOCATED_HOST > {noformat} > Note that the column name is supposed to be of the format (event > id)=(timestamp)=(event info key). However, observe the timestamp portion: > {noformat} > \x7F\xFF\xFE\xABDY=\x99 > {noformat} > The presence of the separator ("=") causes the parse error. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-5109) timestamps are stored unencoded causing parse errors
[ https://issues.apache.org/jira/browse/YARN-5109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15293975#comment-15293975 ] Hadoop QA commented on YARN-5109: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 15s {color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s {color} | {color:green} The patch appears to include 1 new or modified test files. {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 13s {color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 8m 21s {color} | {color:green} YARN-2928 passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 47s {color} | {color:green} YARN-2928 passed with JDK v1.8.0_91 {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 49s {color} | {color:green} YARN-2928 passed with JDK v1.7.0_101 {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 29s {color} | {color:green} YARN-2928 passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 51s {color} | {color:green} YARN-2928 passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 37s {color} | {color:green} YARN-2928 passed {color} | | {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 0m 15s {color} | {color:red} branch/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice-hbase-tests no findbugs output file (hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice-hbase-tests/target/findbugsXml.xml) {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 29s {color} | {color:green} YARN-2928 passed with JDK v1.8.0_91 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 32s {color} | {color:green} YARN-2928 passed with JDK v1.7.0_101 {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 13s {color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 43s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 38s {color} | {color:green} the patch passed with JDK v1.8.0_91 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 38s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 43s {color} | {color:green} the patch passed with JDK v1.7.0_101 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 43s {color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 29s {color} | {color:red} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server: patch generated 19 new + 2 unchanged - 0 fixed = 21 total (was 2) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 50s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 31s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} Patch has no whitespace issues. {color} | | {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 0m 13s {color} | {color:red} patch/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice-hbase-tests no findbugs output file (hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice-hbase-tests/target/findbugsXml.xml) {color} | | {color:red}-1{color} | {color:red} javadoc {color} | {color:red} 1m 38s {color} | {color:red} hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-timelineservice-jdk1.8.0_91 with JDK v1.8.0_91 generated 10 new + 0 unchanged - 0 fixed = 10 total (was 0) {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 27s {color} | {color:green} the patch passed with JDK v1.8.0_91 {color} | | {color:red}-1{color} | {color:red} javadoc {color} | {color:red} 2m 28s {color} | {color:red} hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-timelineservice-jdk1.7.0_101 with JDK v1.7.0_101 generated 2 new + 0 unchanged - 0 fixed = 2 total
[jira] [Commented] (YARN-5109) timestamps are stored unencoded causing parse errors
[ https://issues.apache.org/jira/browse/YARN-5109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15293713#comment-15293713 ] Sangjin Lee commented on YARN-5109: --- The {{ApplicationTable}} and {{EntityTable}} javadoc also reflect it, so it appears to be intentional, but I may be forgetting something. > timestamps are stored unencoded causing parse errors > > > Key: YARN-5109 > URL: https://issues.apache.org/jira/browse/YARN-5109 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Affects Versions: YARN-2928 >Reporter: Sangjin Lee >Assignee: Varun Saxena >Priority: Blocker > Labels: yarn-2928-1st-milestone > Attachments: YARN-5109-YARN-2928.01.patch, > YARN-5109-YARN-2928.02.patch > > > When we store timestamps (for example as part of the row key or part of the > column name for an event), the bytes are used as is without any encoding. If > the byte value happens to contain a separator character we use (e.g. "!" or > "="), it causes a parse failure when we read it. > I came across this while looking into this error in the timeline reader: > {noformat} > 2016-05-17 21:28:38,643 WARN > org.apache.hadoop.yarn.server.timelineservice.storage.common.TimelineStorageUtils: > incorrectly formatted column name: it will be discarded > {noformat} > I traced the data that was causing this, and the column name (for the event) > was the following: > {noformat} > i:e!YARN_RM_CONTAINER_CREATED=\x7F\xFF\xFE\xABDY=\x99=YARN_CONTAINER_ALLOCATED_HOST > {noformat} > Note that the column name is supposed to be of the format (event > id)=(timestamp)=(event info key). However, observe the timestamp portion: > {noformat} > \x7F\xFF\xFE\xABDY=\x99 > {noformat} > The presence of the separator ("=") causes the parse error. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-5109) timestamps are stored unencoded causing parse errors
[ https://issues.apache.org/jira/browse/YARN-5109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15293705#comment-15293705 ] Sangjin Lee commented on YARN-5109: --- Hmm, I don't remember there was a reason the application row key had to be different from the entity row key. Maybe I'm forgetting something. [~vrushalic], [~jrottinghuis], did we intentionally have the application row key structure different than the entity row key structure, or is this my error? > timestamps are stored unencoded causing parse errors > > > Key: YARN-5109 > URL: https://issues.apache.org/jira/browse/YARN-5109 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Affects Versions: YARN-2928 >Reporter: Sangjin Lee >Assignee: Varun Saxena >Priority: Blocker > Labels: yarn-2928-1st-milestone > Attachments: YARN-5109-YARN-2928.01.patch, > YARN-5109-YARN-2928.02.patch > > > When we store timestamps (for example as part of the row key or part of the > column name for an event), the bytes are used as is without any encoding. If > the byte value happens to contain a separator character we use (e.g. "!" or > "="), it causes a parse failure when we read it. > I came across this while looking into this error in the timeline reader: > {noformat} > 2016-05-17 21:28:38,643 WARN > org.apache.hadoop.yarn.server.timelineservice.storage.common.TimelineStorageUtils: > incorrectly formatted column name: it will be discarded > {noformat} > I traced the data that was causing this, and the column name (for the event) > was the following: > {noformat} > i:e!YARN_RM_CONTAINER_CREATED=\x7F\xFF\xFE\xABDY=\x99=YARN_CONTAINER_ALLOCATED_HOST > {noformat} > Note that the column name is supposed to be of the format (event > id)=(timestamp)=(event info key). However, observe the timestamp portion: > {noformat} > \x7F\xFF\xFE\xABDY=\x99 > {noformat} > The presence of the separator ("=") causes the parse error. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-5109) timestamps are stored unencoded causing parse errors
[ https://issues.apache.org/jira/browse/YARN-5109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15293666#comment-15293666 ] Varun Saxena commented on YARN-5109: [~sjlee0], any reason we have clusterid followed by user id in application row key and other way round for entity row key. Just noticed while coding. > timestamps are stored unencoded causing parse errors > > > Key: YARN-5109 > URL: https://issues.apache.org/jira/browse/YARN-5109 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Affects Versions: YARN-2928 >Reporter: Sangjin Lee >Assignee: Varun Saxena >Priority: Blocker > Labels: yarn-2928-1st-milestone > Attachments: YARN-5109-YARN-2928.01.patch, > YARN-5109-YARN-2928.02.patch > > > When we store timestamps (for example as part of the row key or part of the > column name for an event), the bytes are used as is without any encoding. If > the byte value happens to contain a separator character we use (e.g. "!" or > "="), it causes a parse failure when we read it. > I came across this while looking into this error in the timeline reader: > {noformat} > 2016-05-17 21:28:38,643 WARN > org.apache.hadoop.yarn.server.timelineservice.storage.common.TimelineStorageUtils: > incorrectly formatted column name: it will be discarded > {noformat} > I traced the data that was causing this, and the column name (for the event) > was the following: > {noformat} > i:e!YARN_RM_CONTAINER_CREATED=\x7F\xFF\xFE\xABDY=\x99=YARN_CONTAINER_ALLOCATED_HOST > {noformat} > Note that the column name is supposed to be of the format (event > id)=(timestamp)=(event info key). However, observe the timestamp portion: > {noformat} > \x7F\xFF\xFE\xABDY=\x99 > {noformat} > The presence of the separator ("=") causes the parse error. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-5109) timestamps are stored unencoded causing parse errors
[ https://issues.apache.org/jira/browse/YARN-5109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15293464#comment-15293464 ] Varun Saxena commented on YARN-5109: Yes I have taken that patch and working on top of it. > timestamps are stored unencoded causing parse errors > > > Key: YARN-5109 > URL: https://issues.apache.org/jira/browse/YARN-5109 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Affects Versions: YARN-2928 >Reporter: Sangjin Lee >Assignee: Varun Saxena >Priority: Blocker > Labels: yarn-2928-1st-milestone > Attachments: YARN-5109-YARN-2928.01.patch, > YARN-5109-YARN-2928.02.patch > > > When we store timestamps (for example as part of the row key or part of the > column name for an event), the bytes are used as is without any encoding. If > the byte value happens to contain a separator character we use (e.g. "!" or > "="), it causes a parse failure when we read it. > I came across this while looking into this error in the timeline reader: > {noformat} > 2016-05-17 21:28:38,643 WARN > org.apache.hadoop.yarn.server.timelineservice.storage.common.TimelineStorageUtils: > incorrectly formatted column name: it will be discarded > {noformat} > I traced the data that was causing this, and the column name (for the event) > was the following: > {noformat} > i:e!YARN_RM_CONTAINER_CREATED=\x7F\xFF\xFE\xABDY=\x99=YARN_CONTAINER_ALLOCATED_HOST > {noformat} > Note that the column name is supposed to be of the format (event > id)=(timestamp)=(event info key). However, observe the timestamp portion: > {noformat} > \x7F\xFF\xFE\xABDY=\x99 > {noformat} > The presence of the separator ("=") causes the parse error. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-5109) timestamps are stored unencoded causing parse errors
[ https://issues.apache.org/jira/browse/YARN-5109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15293458#comment-15293458 ] Joep Rottinghuis commented on YARN-5109: The code in the patch I attached compiles and clears unit tests so by itself should be good to go modulo the items left to do as listed above. > timestamps are stored unencoded causing parse errors > > > Key: YARN-5109 > URL: https://issues.apache.org/jira/browse/YARN-5109 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Affects Versions: YARN-2928 >Reporter: Sangjin Lee >Assignee: Varun Saxena >Priority: Blocker > Labels: yarn-2928-1st-milestone > Attachments: YARN-5109-YARN-2928.01.patch, > YARN-5109-YARN-2928.02.patch > > > When we store timestamps (for example as part of the row key or part of the > column name for an event), the bytes are used as is without any encoding. If > the byte value happens to contain a separator character we use (e.g. "!" or > "="), it causes a parse failure when we read it. > I came across this while looking into this error in the timeline reader: > {noformat} > 2016-05-17 21:28:38,643 WARN > org.apache.hadoop.yarn.server.timelineservice.storage.common.TimelineStorageUtils: > incorrectly formatted column name: it will be discarded > {noformat} > I traced the data that was causing this, and the column name (for the event) > was the following: > {noformat} > i:e!YARN_RM_CONTAINER_CREATED=\x7F\xFF\xFE\xABDY=\x99=YARN_CONTAINER_ALLOCATED_HOST > {noformat} > Note that the column name is supposed to be of the format (event > id)=(timestamp)=(event info key). However, observe the timestamp portion: > {noformat} > \x7F\xFF\xFE\xABDY=\x99 > {noformat} > The presence of the separator ("=") causes the parse error. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-5109) timestamps are stored unencoded causing parse errors
[ https://issues.apache.org/jira/browse/YARN-5109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15293028#comment-15293028 ] Varun Saxena commented on YARN-5109: Thanks Sangjin and Joep for the pseudocode and prototype. Now I can clearly get what both of you were alluding to in the meeting. On the face of it, this should work in all the cases. Will check this in detail and hopefully have a concrete patch soon. > timestamps are stored unencoded causing parse errors > > > Key: YARN-5109 > URL: https://issues.apache.org/jira/browse/YARN-5109 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Affects Versions: YARN-2928 >Reporter: Sangjin Lee >Assignee: Varun Saxena >Priority: Blocker > Labels: yarn-2928-1st-milestone > Attachments: YARN-5109-YARN-2928.01.patch, > YARN-5109-YARN-2928.02.patch > > > When we store timestamps (for example as part of the row key or part of the > column name for an event), the bytes are used as is without any encoding. If > the byte value happens to contain a separator character we use (e.g. "!" or > "="), it causes a parse failure when we read it. > I came across this while looking into this error in the timeline reader: > {noformat} > 2016-05-17 21:28:38,643 WARN > org.apache.hadoop.yarn.server.timelineservice.storage.common.TimelineStorageUtils: > incorrectly formatted column name: it will be discarded > {noformat} > I traced the data that was causing this, and the column name (for the event) > was the following: > {noformat} > i:e!YARN_RM_CONTAINER_CREATED=\x7F\xFF\xFE\xABDY=\x99=YARN_CONTAINER_ALLOCATED_HOST > {noformat} > Note that the column name is supposed to be of the format (event > id)=(timestamp)=(event info key). However, observe the timestamp portion: > {noformat} > \x7F\xFF\xFE\xABDY=\x99 > {noformat} > The presence of the separator ("=") causes the parse error. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-5109) timestamps are stored unencoded causing parse errors
[ https://issues.apache.org/jira/browse/YARN-5109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15292819#comment-15292819 ] Joep Rottinghuis commented on YARN-5109: [~varun_saxena] I'm going to briefly assign this jira to me so that I can upload a new patch, after uploading it, I'll assign it back to you. It seems that when you're not an assignee, you cannot attach a patch. Perhaps one of the security measures they have taken I guess. > timestamps are stored unencoded causing parse errors > > > Key: YARN-5109 > URL: https://issues.apache.org/jira/browse/YARN-5109 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Affects Versions: YARN-2928 >Reporter: Sangjin Lee >Assignee: Varun Saxena >Priority: Blocker > Labels: yarn-2928-1st-milestone > Attachments: YARN-5109-YARN-2928.01.patch > > > When we store timestamps (for example as part of the row key or part of the > column name for an event), the bytes are used as is without any encoding. If > the byte value happens to contain a separator character we use (e.g. "!" or > "="), it causes a parse failure when we read it. > I came across this while looking into this error in the timeline reader: > {noformat} > 2016-05-17 21:28:38,643 WARN > org.apache.hadoop.yarn.server.timelineservice.storage.common.TimelineStorageUtils: > incorrectly formatted column name: it will be discarded > {noformat} > I traced the data that was causing this, and the column name (for the event) > was the following: > {noformat} > i:e!YARN_RM_CONTAINER_CREATED=\x7F\xFF\xFE\xABDY=\x99=YARN_CONTAINER_ALLOCATED_HOST > {noformat} > Note that the column name is supposed to be of the format (event > id)=(timestamp)=(event info key). However, observe the timestamp portion: > {noformat} > \x7F\xFF\xFE\xABDY=\x99 > {noformat} > The presence of the separator ("=") causes the parse error. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-5109) timestamps are stored unencoded causing parse errors
[ https://issues.apache.org/jira/browse/YARN-5109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15292696#comment-15292696 ] Joep Rottinghuis commented on YARN-5109: I have an almost working prototype where we can combined readResults and readResultsHavingCompoundColumnQualifiers methods. It also fixes a case where we do not properly decode spaces in a column name. Aside from that it removes the (false) assumptions that if a prefix is null then the column qualifiers cannot be compound. Column qualifiers being compound or not are separate from prefixes being null or not. Also removing the cross-dependency between Separator and TimelineStorageUtils. Will upload patch (in whatever state I have it) when I log off for tonight. > timestamps are stored unencoded causing parse errors > > > Key: YARN-5109 > URL: https://issues.apache.org/jira/browse/YARN-5109 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Affects Versions: YARN-2928 >Reporter: Sangjin Lee >Assignee: Varun Saxena >Priority: Blocker > Labels: yarn-2928-1st-milestone > Attachments: YARN-5109-YARN-2928.01.patch > > > When we store timestamps (for example as part of the row key or part of the > column name for an event), the bytes are used as is without any encoding. If > the byte value happens to contain a separator character we use (e.g. "!" or > "="), it causes a parse failure when we read it. > I came across this while looking into this error in the timeline reader: > {noformat} > 2016-05-17 21:28:38,643 WARN > org.apache.hadoop.yarn.server.timelineservice.storage.common.TimelineStorageUtils: > incorrectly formatted column name: it will be discarded > {noformat} > I traced the data that was causing this, and the column name (for the event) > was the following: > {noformat} > i:e!YARN_RM_CONTAINER_CREATED=\x7F\xFF\xFE\xABDY=\x99=YARN_CONTAINER_ALLOCATED_HOST > {noformat} > Note that the column name is supposed to be of the format (event > id)=(timestamp)=(event info key). However, observe the timestamp portion: > {noformat} > \x7F\xFF\xFE\xABDY=\x99 > {noformat} > The presence of the separator ("=") causes the parse error. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-5109) timestamps are stored unencoded causing parse errors
[ https://issues.apache.org/jira/browse/YARN-5109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15292270#comment-15292270 ] Sangjin Lee commented on YARN-5109: --- Another related issue I see is currently the event id and the info key are not encoded as they are joined. I think this is an opportunity to address that. Although it is rather unlikely the event id or info key would contain "=", but we should be safe than sorry... > timestamps are stored unencoded causing parse errors > > > Key: YARN-5109 > URL: https://issues.apache.org/jira/browse/YARN-5109 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Affects Versions: YARN-2928 >Reporter: Sangjin Lee >Assignee: Varun Saxena >Priority: Blocker > Labels: yarn-2928-1st-milestone > Attachments: YARN-5109-YARN-2928.01.patch > > > When we store timestamps (for example as part of the row key or part of the > column name for an event), the bytes are used as is without any encoding. If > the byte value happens to contain a separator character we use (e.g. "!" or > "="), it causes a parse failure when we read it. > I came across this while looking into this error in the timeline reader: > {noformat} > 2016-05-17 21:28:38,643 WARN > org.apache.hadoop.yarn.server.timelineservice.storage.common.TimelineStorageUtils: > incorrectly formatted column name: it will be discarded > {noformat} > I traced the data that was causing this, and the column name (for the event) > was the following: > {noformat} > i:e!YARN_RM_CONTAINER_CREATED=\x7F\xFF\xFE\xABDY=\x99=YARN_CONTAINER_ALLOCATED_HOST > {noformat} > Note that the column name is supposed to be of the format (event > id)=(timestamp)=(event info key). However, observe the timestamp portion: > {noformat} > \x7F\xFF\xFE\xABDY=\x99 > {noformat} > The presence of the separator ("=") causes the parse error. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-5109) timestamps are stored unencoded causing parse errors
[ https://issues.apache.org/jira/browse/YARN-5109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15292229#comment-15292229 ] Sangjin Lee commented on YARN-5109: --- Joep, Vrushali, and I discussed some more offline, and we realized that there can be a reasonable pattern we can extract based on the individual {{*RowKey.getRowKey()}} and {{*RowKey.parseRowKey()}}, and extend it to cover the compound column names also. We could have an interface that encapsulates the concern of defining the key structure and how to encode/decode it. We would still need the new {{split()}} method to dictate the low-level parsing. The following is a proof-of-concept pseudo code I put together in a hurry: {code} public interface CompoundKeyConverter { byte[] encodeKey(T t); T decodeKey(byte[] b); int[] sizesOfTokens(); // it's debatable whether this should be part of the interface } public class ApplicationRowKeyConverter implements CompoundKeyConverter { public static final int[] SIZES = {0, 0, 0, Long.BYTES, Long.BYTES + Integer.BYTES}; public int[] sizesOfTokens() { return SIZES; } // almost identical to ApplicationRowKey.getRowKey() public byte[] encodeKey(ApplicationRowKey key) { byte[] first = Bytes.toBytes(Separator.QUALIFIERS.joinEncoded(key.getClusterId(), key.getUserId(), key.getFlowName())); // Note that flowRunId is a long, so we can't encode them all at the same // time. byte[] second = Bytes.toBytes(TimelineStorageUtils.invertLong(key.getFlowRunId())); byte[] third = TimelineStorageUtils.encodeAppId(key.getAppId()); return Separator.QUALIFIERS.join(first, second, third); } // almost identical to ApplicationRowKey.parseRowKey() public ApplicationRowKey decodeKey(byte[] data) { // use the new version of the split method that takes the sizes byte[][] rowKeyComponents = Separator.QUALIFIERS.split(data, sizesOfTokens()); if (rowKeyComponents.length < 5) { throw new IllegalArgumentException("the row key is not valid for " + "an application"); } String clusterId = Separator.QUALIFIERS.decode(Bytes.toString(rowKeyComponents[0])); String userId = Separator.QUALIFIERS.decode(Bytes.toString(rowKeyComponents[1])); String flowName = Separator.QUALIFIERS.decode(Bytes.toString(rowKeyComponents[2])); long flowRunId = TimelineStorageUtils.invertLong(Bytes.toLong(rowKeyComponents[3])); String appId = TimelineStorageUtils.decodeAppId(rowKeyComponents[4]); return new ApplicationRowKey(clusterId, userId, flowName, flowRunId, appId); } } public class EventColumnName { private final String id; private final long timestamp; private final String infoKey; public EventColumnName(String id, long timestamp, String infoKey) { this.id = id; this.timestamp = timestamp; this.infoKey = infoKey; } public String getId() { return id; } public long getTimestamp() { return timestamp; } public String getInfoKey() { return infoKey; } } public class EventColumnNameConverter implements CompoundKeyConverter { private static final int[] SIZES = {0, Long.BYTES, 0}; public int[] sizesOfTokens() { return SIZES; } // from HBaseTimelineWriterImpl.storeEvents() public byte[] encodeKey(EventColumnName q) { return ColumnHelper.getCompoundColumnQualifierBytes(q.getId(), Bytes.toBytes(TimelineStorageUtils.invertLong(q.getTimestamp())), Bytes.toBytes(infoKey)); } public EventColumnName decodeKey(byte[] data) { // use the new split() method that takes the sizes byte[][] components = Separator.VALUES.split(data, sizesOfTokens()); if (components.length != 3) { throw new IllegalArgumentException("the column name is not valid"); } String id = Bytes.toString(components[0]); long ts = TimelineStorageUtils.invertLong(Bytes.toLong(components[1])); String infoKey = components[2].length == 0 ? null : Bytes.toString(components[2]); return new EventColumnName(id, ts, infoKey); } } HBaseTimelineWriterImpl.java: private void storeEvents(...) { ... CompoundKeyConverter converter = new EventColumnNameConverter(); EventColumnName e = new EventColumnName(eventId, eventTimestamp, info.getKey()); byte[] compoundColumnQualifierBytes = converter.encodeKey(e); ... } TimelineStorageUtils.java: public static void readEvents(...) { ... CompoundKeyConverter converter = new EventColumnNameConverter(); // ColumnPrefix API should change as well MapeventsResult = prefix.readResultsHavingCompoundColumnQualifiers(result, converter); for (Map.Entry eventResult : eventsResult.entrySet()) { ... } } ColumnHelper.java: public Map readResultsHavingCompoundColumnQualifiers(Result result,
[jira] [Commented] (YARN-5109) timestamps are stored unencoded causing parse errors
[ https://issues.apache.org/jira/browse/YARN-5109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15291983#comment-15291983 ] Varun Saxena commented on YARN-5109: Writing down what we discussed in the meeting with regards to this. I have attached a WIP patch of whatever I had done so far. Some things may have to be fine tuned(null checks etc.). Probably both limit and sizes need not be passed into splitRanges. But anyways this works well for row keys as per my tests. For column qualifiers,we could either do encoding of bytes(representing longs, ints, app ids') before writing them into backend or use the same approach which we adopted for row keys i.e. do not encode but split by ignoring separators for longs, ints, etc. till they are fully read. I had a patch for former too but consensus seems to be towards latter. Sangjin and Joep thought that we should reorganize our row key and column qualifier parsing logic into a set of converters. It was decided that this approach will be explored further. > timestamps are stored unencoded causing parse errors > > > Key: YARN-5109 > URL: https://issues.apache.org/jira/browse/YARN-5109 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Affects Versions: YARN-2928 >Reporter: Sangjin Lee >Assignee: Varun Saxena >Priority: Blocker > Labels: yarn-2928-1st-milestone > Attachments: YARN-5109-YARN-2928.01.patch > > > When we store timestamps (for example as part of the row key or part of the > column name for an event), the bytes are used as is without any encoding. If > the byte value happens to contain a separator character we use (e.g. "!" or > "="), it causes a parse failure when we read it. > I came across this while looking into this error in the timeline reader: > {noformat} > 2016-05-17 21:28:38,643 WARN > org.apache.hadoop.yarn.server.timelineservice.storage.common.TimelineStorageUtils: > incorrectly formatted column name: it will be discarded > {noformat} > I traced the data that was causing this, and the column name (for the event) > was the following: > {noformat} > i:e!YARN_RM_CONTAINER_CREATED=\x7F\xFF\xFE\xABDY=\x99=YARN_CONTAINER_ALLOCATED_HOST > {noformat} > Note that the column name is supposed to be of the format (event > id)=(timestamp)=(event info key). However, observe the timestamp portion: > {noformat} > \x7F\xFF\xFE\xABDY=\x99 > {noformat} > The presence of the separator ("=") causes the parse error. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-5109) timestamps are stored unencoded causing parse errors
[ https://issues.apache.org/jira/browse/YARN-5109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15291529#comment-15291529 ] Varun Saxena commented on YARN-5109: Sure. Lets discuss this in the meeting. Basically column qualifiers are read in ColumnHelper, which is a common piece of code, completely unaware of details of specific column qualifiers. So we may have to pass this size information somehow to ColumnHelper but this will increase the constructor size. Or we can consider going with encoding bytes. Wasn't considering escaping existing patterns though, which more than string maybe a problem with longs though in terms of possibility of appearing. > timestamps are stored unencoded causing parse errors > > > Key: YARN-5109 > URL: https://issues.apache.org/jira/browse/YARN-5109 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Affects Versions: YARN-2928 >Reporter: Sangjin Lee >Assignee: Varun Saxena >Priority: Blocker > Labels: yarn-2928-1st-milestone > > When we store timestamps (for example as part of the row key or part of the > column name for an event), the bytes are used as is without any encoding. If > the byte value happens to contain a separator character we use (e.g. "!" or > "="), it causes a parse failure when we read it. > I came across this while looking into this error in the timeline reader: > {noformat} > 2016-05-17 21:28:38,643 WARN > org.apache.hadoop.yarn.server.timelineservice.storage.common.TimelineStorageUtils: > incorrectly formatted column name: it will be discarded > {noformat} > I traced the data that was causing this, and the column name (for the event) > was the following: > {noformat} > i:e!YARN_RM_CONTAINER_CREATED=\x7F\xFF\xFE\xABDY=\x99=YARN_CONTAINER_ALLOCATED_HOST > {noformat} > Note that the column name is supposed to be of the format (event > id)=(timestamp)=(event info key). However, observe the timestamp portion: > {noformat} > \x7F\xFF\xFE\xABDY=\x99 > {noformat} > The presence of the separator ("=") causes the parse error. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-5109) timestamps are stored unencoded causing parse errors
[ https://issues.apache.org/jira/browse/YARN-5109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15291498#comment-15291498 ] Sangjin Lee commented on YARN-5109: --- Good point about the column qualifiers not needing the correct order. One other complication with the column qualifiers is that the call chain is several levels. The changes to use the new {{split()}} method there would be bit bigger. If we were to go the route of encoding the bytes as well, there is one other issue we need to be mindful of. We need to guard against the occurrences of the encoded equivalent in the original bytes. For example, "=" would be encoded into "%1$". A problem would arise if the original bytes already contained "%1$" however unlikely that may be. Consider the following original bytes (totally made up with ascii characters): {noformat} t=h%1$ig {noformat} If we simply encode "=", then we get {noformat} t%1$h%1$ig {noformat} Now, if we read this back and decode it, we would decode it to {noformat} t=h=ig {noformat} To do this properly, we'd need to "escape" the existing patterns *before* encoding for the separator. The reverse should be done when decoding it. To be clear, this is an existing issue (even with strings). We went ahead without treating for this as we felt that this is unlikely to occur in a string. But if we're going to revisit encoding, we might want to address that as well. We can discuss the details offline if needed. > timestamps are stored unencoded causing parse errors > > > Key: YARN-5109 > URL: https://issues.apache.org/jira/browse/YARN-5109 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Affects Versions: YARN-2928 >Reporter: Sangjin Lee >Assignee: Varun Saxena >Priority: Blocker > Labels: yarn-2928-1st-milestone > > When we store timestamps (for example as part of the row key or part of the > column name for an event), the bytes are used as is without any encoding. If > the byte value happens to contain a separator character we use (e.g. "!" or > "="), it causes a parse failure when we read it. > I came across this while looking into this error in the timeline reader: > {noformat} > 2016-05-17 21:28:38,643 WARN > org.apache.hadoop.yarn.server.timelineservice.storage.common.TimelineStorageUtils: > incorrectly formatted column name: it will be discarded > {noformat} > I traced the data that was causing this, and the column name (for the event) > was the following: > {noformat} > i:e!YARN_RM_CONTAINER_CREATED=\x7F\xFF\xFE\xABDY=\x99=YARN_CONTAINER_ALLOCATED_HOST > {noformat} > Note that the column name is supposed to be of the format (event > id)=(timestamp)=(event info key). However, observe the timestamp portion: > {noformat} > \x7F\xFF\xFE\xABDY=\x99 > {noformat} > The presence of the separator ("=") causes the parse error. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-5109) timestamps are stored unencoded causing parse errors
[ https://issues.apache.org/jira/browse/YARN-5109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15290991#comment-15290991 ] Varun Saxena commented on YARN-5109: Thanks [~sjlee0]. I did realize this issue with timestamps in row keys. Was trying a way around for this kind of scenario with row keys by handling it on parse side in individual row key classes(i.e. do not encode and on read side consider separator as part of long/int, if i hasnt been read as yet). But your suggestion of split looks great. This reduces the changes. Will code according to this. For column qualifier though, this change isnt required and we can actually do encoding because for columns we will either have single column value filter or qualifier filter with prefixes applied. If I am not wrong, we do not need to preserve ordering for column qualifiers. However, we can keep it consistent because split method can be changed as per your suggestion. Will remove the encoding bytes routine then. > timestamps are stored unencoded causing parse errors > > > Key: YARN-5109 > URL: https://issues.apache.org/jira/browse/YARN-5109 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Affects Versions: YARN-2928 >Reporter: Sangjin Lee >Assignee: Varun Saxena >Priority: Blocker > Labels: yarn-2928-1st-milestone > > When we store timestamps (for example as part of the row key or part of the > column name for an event), the bytes are used as is without any encoding. If > the byte value happens to contain a separator character we use (e.g. "!" or > "="), it causes a parse failure when we read it. > I came across this while looking into this error in the timeline reader: > {noformat} > 2016-05-17 21:28:38,643 WARN > org.apache.hadoop.yarn.server.timelineservice.storage.common.TimelineStorageUtils: > incorrectly formatted column name: it will be discarded > {noformat} > I traced the data that was causing this, and the column name (for the event) > was the following: > {noformat} > i:e!YARN_RM_CONTAINER_CREATED=\x7F\xFF\xFE\xABDY=\x99=YARN_CONTAINER_ALLOCATED_HOST > {noformat} > Note that the column name is supposed to be of the format (event > id)=(timestamp)=(event info key). However, observe the timestamp portion: > {noformat} > \x7F\xFF\xFE\xABDY=\x99 > {noformat} > The presence of the separator ("=") causes the parse error. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-5109) timestamps are stored unencoded causing parse errors
[ https://issues.apache.org/jira/browse/YARN-5109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15290166#comment-15290166 ] Sangjin Lee commented on YARN-5109: --- Here is a proposal. Instead of blindly splitting along the separator boundaries, we need to be able to tell the Separator to tokenize using the size also. The following is a quick prototype I put together to show the concept. This is an overloaded {{split()}} method: {code:title=TimelineStorageUtils.java|borderStyle=solid} public static final int NO_LIMIT = 0; public static List splitRanges(byte[] source, byte[] separator, int[] sizes) { List segments = new ArrayList(); if (source == null || separator == null || sizes == null) { return segments; } int start = 0; int i = 0; int k = 0; itersource: while (i < source.length && segments.size() < sizes.length) { int currentTokenSize = sizes[k]; if (currentTokenSize > NO_LIMIT) { // we explicitly grab a fixed number of bytes if (start + currentTokenSize > source.length) { // it's seeking beyond the source boundary throw new IllegalArgumentException("source is " + source.length + " bytes long and we're asking for " + (start + currentTokenSize)); } segments.add(new Range(start, start + currentTokenSize)); start += currentTokenSize; i += currentTokenSize; k++; // if there is more to parse, there must be a separator; strip it if (k <= sizes.length - 1) { for (int j = 0; j < separator.length; j++) { if (source[i + j] != separator[j]) { throw new IllegalArgumentException("separator is expected"); } continue; } // matched the separator start = i + separator.length; i += separator.length; } } else if (currentTokenSize == NO_LIMIT) { // use the separator // continue until we match the separator for (int j = 0; j < separator.length; j++) { if (source[i + j] != separator[j]) { i++; continue itersource; } // we just matched all separator elements segments.add(new Range(start, i)); start = i + separator.length; i += separator.length; k++; } } else { throw new IllegalArgumentException("negative size provided"); } } // add the final segment if (start < source.length && segments.size() < sizes.length) { // by deduction this can happen only if the token size = NO_LIMIT segments.add(new Range(start, source.length)); } return segments; } {code} You can basically instruct the utility if you expect certain size tokens when it parses the bytes. Value = 0 indicates the existing parsing behavior (parse until you hit the separator). Positive values indicate the number of bytes to grab whether or not there is a separator in the middle. For example, if we expect the structure to be a string, a long (as bytes), and a string you can invoke it with \{0, Long.BYTES, 0\}. That way, we can dictate how the row keys and column name qualifiers should be parsed. > timestamps are stored unencoded causing parse errors > > > Key: YARN-5109 > URL: https://issues.apache.org/jira/browse/YARN-5109 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Affects Versions: YARN-2928 >Reporter: Sangjin Lee >Assignee: Varun Saxena >Priority: Blocker > Labels: yarn-2928-1st-milestone > > When we store timestamps (for example as part of the row key or part of the > column name for an event), the bytes are used as is without any encoding. If > the byte value happens to contain a separator character we use (e.g. "!" or > "="), it causes a parse failure when we read it. > I came across this while looking into this error in the timeline reader: > {noformat} > 2016-05-17 21:28:38,643 WARN > org.apache.hadoop.yarn.server.timelineservice.storage.common.TimelineStorageUtils: > incorrectly formatted column name: it will be discarded > {noformat} > I traced the data that was causing this, and the column name (for the event) > was the following: > {noformat} > i:e!YARN_RM_CONTAINER_CREATED=\x7F\xFF\xFE\xABDY=\x99=YARN_CONTAINER_ALLOCATED_HOST > {noformat} > Note that the column name is supposed to be of the format (event > id)=(timestamp)=(event info key). However, observe the timestamp portion: > {noformat} > \x7F\xFF\xFE\xABDY=\x99 > {noformat} > The presence of the separator ("=") causes the parse error. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe,
[jira] [Commented] (YARN-5109) timestamps are stored unencoded causing parse errors
[ https://issues.apache.org/jira/browse/YARN-5109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15289895#comment-15289895 ] Sangjin Lee commented on YARN-5109: --- I spoke with [~jrottinghuis] offline about this. Initially we were thinking that we should encode those characters even in the case of bytes (essentially creating {{Separator.joinEncoded(byte[]...)}} and removing the raw {{Separator.join()}} method), but we are realizing that won't work. The key here is that we not only need to handle those separator characters ("=", "!", etc.) but also *preserve the ordering*. For example, suppose we have two timestamps ({{ts1}} and {{ts2}}) where {{ts1 < ts2}}. And assume {{ts2}} has a separator character in it. If we blindly encoded the separator character, we could easily violate {{ts1 < ts2}} once they are written. This would break all sorts of things, including range scans. My proposal is this. In almost all of these cases, the structure of the data we're storing and parsing is known strongly, whether it is the row key or the column qualifier. The problem with the current parsing is it uses solely splitting by separator. We should use the full data structure it knows already to parse correctly. For example, if we know that the structure is (string)=(timestamp)=(string), we can parse the first string, and then take the next 8 bytes *without splitting again* as we know it's a timestamp anyway and convert it into the long number, and take the last token after that. We should be able to follow the same idea in all cases. Thoughts? [~varun_saxena], let me know if you'd like to take a stab at that idea, or I should. > timestamps are stored unencoded causing parse errors > > > Key: YARN-5109 > URL: https://issues.apache.org/jira/browse/YARN-5109 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Affects Versions: YARN-2928 >Reporter: Sangjin Lee >Assignee: Varun Saxena >Priority: Blocker > Labels: yarn-2928-1st-milestone > > When we store timestamps (for example as part of the row key or part of the > column name for an event), the bytes are used as is without any encoding. If > the byte value happens to contain a separator character we use (e.g. "!" or > "="), it causes a parse failure when we read it. > I came across this while looking into this error in the timeline reader: > {noformat} > 2016-05-17 21:28:38,643 WARN > org.apache.hadoop.yarn.server.timelineservice.storage.common.TimelineStorageUtils: > incorrectly formatted column name: it will be discarded > {noformat} > I traced the data that was causing this, and the column name (for the event) > was the following: > {noformat} > i:e!YARN_RM_CONTAINER_CREATED=\x7F\xFF\xFE\xABDY=\x99=YARN_CONTAINER_ALLOCATED_HOST > {noformat} > Note that the column name is supposed to be of the format (event > id)=(timestamp)=(event info key). However, observe the timestamp portion: > {noformat} > \x7F\xFF\xFE\xABDY=\x99 > {noformat} > The presence of the separator ("=") causes the parse error. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-5109) timestamps are stored unencoded causing parse errors
[ https://issues.apache.org/jira/browse/YARN-5109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15289571#comment-15289571 ] Sangjin Lee commented on YARN-5109: --- These are the all the cases of "naked" joins: {noformat}org.apache.hadoop.yarn.server.timelineservice.storage.common.ColumnHelper.getColumnQualifier(byte[], byte[]) org.apache.hadoop.yarn.server.timelineservice.storage.common.ColumnHelper.getColumnQualifier(byte[], long) org.apache.hadoop.yarn.server.timelineservice.storage.common.ColumnHelper.getColumnQualifier(byte[], String) org.apache.hadoop.yarn.server.timelineservice.storage.common.ColumnHelper.getCompoundColumnQualifierBytes(String, byte[]...) org.apache.hadoop.yarn.server.timelineservice.storage.flow.FlowActivityRowKey.getRowKey(String, long, String, String) org.apache.hadoop.yarn.server.timelineservice.storage.entity.EntityRowKey.getRowKey(String, String, String, Long, String, String, String) org.apache.hadoop.yarn.server.timelineservice.storage.application.ApplicationRowKey.getRowKey(String, String, String, Long, String) org.apache.hadoop.yarn.server.timelineservice.storage.flow.FlowRunRowKey.getRowKey(String, String, String, Long) org.apache.hadoop.yarn.server.timelineservice.storage.apptoflow.AppToFlowRowKey.getRowKey(String, String) org.apache.hadoop.yarn.server.timelineservice.storage.flow.FlowActivityRowKey.getRowKeyPrefix(String, long) org.apache.hadoop.yarn.server.timelineservice.storage.entity.EntityRowKey.getRowKeyPrefix(String, String, String, Long, String, String) org.apache.hadoop.yarn.server.timelineservice.storage.entity.EntityRowKey.getRowKeyPrefix(String, String, String, Long, String) org.apache.hadoop.yarn.server.timelineservice.storage.application.ApplicationRowKey.getRowKeyPrefix(String, String, String, Long) org.apache.hadoop.yarn.server.timelineservice.storage.application.ApplicationRowKey.getRowKeyPrefix(String, String, String) {noformat} > timestamps are stored unencoded causing parse errors > > > Key: YARN-5109 > URL: https://issues.apache.org/jira/browse/YARN-5109 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Affects Versions: YARN-2928 >Reporter: Sangjin Lee >Assignee: Varun Saxena >Priority: Blocker > Labels: yarn-2928-1st-milestone > > When we store timestamps (for example as part of the row key or part of the > column name for an event), the bytes are used as is without any encoding. If > the byte value happens to contain a separator character we use (e.g. "!" or > "="), it causes a parse failure when we read it. > I came across this while looking into this error in the timeline reader: > {noformat} > 2016-05-17 21:28:38,643 WARN > org.apache.hadoop.yarn.server.timelineservice.storage.common.TimelineStorageUtils: > incorrectly formatted column name: it will be discarded > {noformat} > I traced the data that was causing this, and the column name (for the event) > was the following: > {noformat} > i:e!YARN_RM_CONTAINER_CREATED=\x7F\xFF\xFE\xABDY=\x99=YARN_CONTAINER_ALLOCATED_HOST > {noformat} > Note that the column name is supposed to be of the format (event > id)=(timestamp)=(event info key). However, observe the timestamp portion: > {noformat} > \x7F\xFF\xFE\xABDY=\x99 > {noformat} > The presence of the separator ("=") causes the parse error. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-5109) timestamps are stored unencoded causing parse errors
[ https://issues.apache.org/jira/browse/YARN-5109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15289237#comment-15289237 ] Varun Saxena commented on YARN-5109: Thanks [~sjlee0]. Yes, I am in the process of coding for this i.e. part regarding column qualifiers. Have to look at row keys too. I had written a UT too to simulate the failure before assigning JIRA to myself. Will have a patch by tomorrow. You are correct that we need to encode in getColumnQualifier too because while reading it back we do a split upto 2 segments for non compound column qualifiers(for fetching info). > timestamps are stored unencoded causing parse errors > > > Key: YARN-5109 > URL: https://issues.apache.org/jira/browse/YARN-5109 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Affects Versions: YARN-2928 >Reporter: Sangjin Lee >Assignee: Varun Saxena >Priority: Blocker > Labels: yarn-2928-1st-milestone > > When we store timestamps (for example as part of the row key or part of the > column name for an event), the bytes are used as is without any encoding. If > the byte value happens to contain a separator character we use (e.g. "!" or > "="), it causes a parse failure when we read it. > I came across this while looking into this error in the timeline reader: > {noformat} > 2016-05-17 21:28:38,643 WARN > org.apache.hadoop.yarn.server.timelineservice.storage.common.TimelineStorageUtils: > incorrectly formatted column name: it will be discarded > {noformat} > I traced the data that was causing this, and the column name (for the event) > was the following: > {noformat} > i:e!YARN_RM_CONTAINER_CREATED=\x7F\xFF\xFE\xABDY=\x99=YARN_CONTAINER_ALLOCATED_HOST > {noformat} > Note that the column name is supposed to be of the format (event > id)=(timestamp)=(event info key). However, observe the timestamp portion: > {noformat} > \x7F\xFF\xFE\xABDY=\x99 > {noformat} > The presence of the separator ("=") causes the parse error. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-5109) timestamps are stored unencoded causing parse errors
[ https://issues.apache.org/jira/browse/YARN-5109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15289207#comment-15289207 ] Sangjin Lee commented on YARN-5109: --- Thanks for looking into this [~varun_saxena]. It might be good to write a small unit test for this. FYI, the offending timestamp in the above example is 1463437148774. I think the scope of this issue is rather big unfortunately. If I'm not mistaken, any time a byte array is passed into {{Separator.join()}} without encoding we need to encode it. Of course it also means we need to decode it on the way out. For example, {{ColumnHelper.getColumnQualifier()}} accepts both {{columnPrefixBytes}} and {{qualifier}} as is and joins them. I can work with you to identify all the places where this is a problem and also the solution. Let's discuss it here on the JIRA. > timestamps are stored unencoded causing parse errors > > > Key: YARN-5109 > URL: https://issues.apache.org/jira/browse/YARN-5109 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Affects Versions: YARN-2928 >Reporter: Sangjin Lee >Assignee: Varun Saxena >Priority: Blocker > Labels: yarn-2928-1st-milestone > > When we store timestamps (for example as part of the row key or part of the > column name for an event), the bytes are used as is without any encoding. If > the byte value happens to contain a separator character we use (e.g. "!" or > "="), it causes a parse failure when we read it. > I came across this while looking into this error in the timeline reader: > {noformat} > 2016-05-17 21:28:38,643 WARN > org.apache.hadoop.yarn.server.timelineservice.storage.common.TimelineStorageUtils: > incorrectly formatted column name: it will be discarded > {noformat} > I traced the data that was causing this, and the column name (for the event) > was the following: > {noformat} > i:e!YARN_RM_CONTAINER_CREATED=\x7F\xFF\xFE\xABDY=\x99=YARN_CONTAINER_ALLOCATED_HOST > {noformat} > Note that the column name is supposed to be of the format (event > id)=(timestamp)=(event info key). However, observe the timestamp portion: > {noformat} > \x7F\xFF\xFE\xABDY=\x99 > {noformat} > The presence of the separator ("=") causes the parse error. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-5109) timestamps are stored unencoded causing parse errors
[ https://issues.apache.org/jira/browse/YARN-5109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15288444#comment-15288444 ] Sangjin Lee commented on YARN-5109: --- For example, code that creates the column name for events: {code} byte[] eventTs = Bytes.toBytes(TimelineStorageUtils.invertLong(eventTimestamp)); ... byte[] compoundColumnQualifierBytes = EntityColumnPrefix.EVENT. getCompoundColQualBytes(eventId, eventTs, null); ... public static byte[] getCompoundColumnQualifierBytes(String qualifier, byte[]...components) { byte[] colQualBytes = Bytes.toBytes(Separator.VALUES.encode(qualifier)); for (int i = 0; i < components.length; i++) { colQualBytes = Separator.VALUES.join(colQualBytes, components[i]); } return colQualBytes; } {code} The {{getCompoundColumnQualifierBytes()}} method uses the bytes from the timestamp as is without any encoding for VALUES ({{\x3d}}). I believe a similar issue exists with row keys. In most cases, long's are passed to the row key without any encoding for QUALIFIERS. If any of the byte values happens to be QUALIFIERS ({{\x21}}), it will cause the row key parsing to fail. > timestamps are stored unencoded causing parse errors > > > Key: YARN-5109 > URL: https://issues.apache.org/jira/browse/YARN-5109 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Affects Versions: YARN-2928 >Reporter: Sangjin Lee >Priority: Blocker > Labels: yarn-2928-1st-milestone > > When we store timestamps (for example as part of the row key or part of the > column name for an event), the bytes are used as is without any encoding. If > the byte value happens to contain a separator character we use (e.g. "!" or > "="), it causes a parse failure when we read it. > I came across this while looking into this error in the timeline reader: > {noformat} > 2016-05-17 21:28:38,643 WARN > org.apache.hadoop.yarn.server.timelineservice.storage.common.TimelineStorageUtils: > incorrectly formatted column name: it will be discarded > {noformat} > I traced the data that was causing this, and the column name (for the event) > was the following: > {noformat} > i:e!YARN_RM_CONTAINER_CREATED=\x7F\xFF\xFE\xABDY=\x99=YARN_CONTAINER_ALLOCATED_HOST > {noformat} > Note that the column name is supposed to be of the format (event > id)=(timestamp)=(event info key). However, observe the timestamp portion: > {noformat} > \x7F\xFF\xFE\xABDY=\x99 > {noformat} > The presence of the separator ("=") causes the parse error. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org