[jira] [Commented] (YARN-5109) timestamps are stored unencoded causing parse errors

2016-07-10 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15369751#comment-15369751
 ] 

Hudson commented on YARN-5109:
--

SUCCESS: Integrated in Hadoop-trunk-Commit #10074 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/10074/])
YARN-5109. timestamps are stored unencoded causing parse errors (Varun (sjlee: 
rev 7b8cfa5c2ff62005c8b78867fedd64b48e50383d)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice/src/main/java/org/apache/hadoop/yarn/server/timelineservice/storage/common/AppIdKeyConverter.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice/src/test/java/org/apache/hadoop/yarn/server/timelineservice/storage/common/TestKeyConverters.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice/src/main/java/org/apache/hadoop/yarn/server/timelineservice/storage/flow/FlowActivityRowKeyConverter.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice/src/main/java/org/apache/hadoop/yarn/server/timelineservice/storage/flow/FlowRunRowKeyConverter.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice-hbase-tests/src/test/java/org/apache/hadoop/yarn/server/timelineservice/storage/flow/TestHBaseStorageFlowActivity.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice/src/main/java/org/apache/hadoop/yarn/server/timelineservice/storage/reader/TimelineEntityReader.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice/src/main/java/org/apache/hadoop/yarn/server/timelineservice/storage/common/Separator.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice/src/main/java/org/apache/hadoop/yarn/server/timelineservice/storage/flow/FlowRunRowKey.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice/src/main/java/org/apache/hadoop/yarn/server/timelineservice/storage/apptoflow/AppToFlowRowKey.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice/src/main/java/org/apache/hadoop/yarn/server/timelineservice/storage/common/StringKeyConverter.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice/src/main/java/org/apache/hadoop/yarn/server/timelineservice/storage/flow/FlowActivityRowKey.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice/src/main/java/org/apache/hadoop/yarn/server/timelineservice/storage/flow/FlowRunColumnPrefix.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice/src/main/java/org/apache/hadoop/yarn/server/timelineservice/storage/reader/FlowActivityEntityReader.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice-hbase-tests/src/test/java/org/apache/hadoop/yarn/server/timelineservice/storage/TestHBaseTimelineStorage.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice/src/main/java/org/apache/hadoop/yarn/server/timelineservice/storage/common/KeyConverter.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice/src/main/java/org/apache/hadoop/yarn/server/timelineservice/storage/apptoflow/AppToFlowRowKeyConverter.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice/src/main/java/org/apache/hadoop/yarn/server/timelineservice/storage/application/ApplicationRowKey.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice/src/main/java/org/apache/hadoop/yarn/server/timelineservice/storage/application/ApplicationColumnPrefix.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice/src/main/java/org/apache/hadoop/yarn/server/timelineservice/storage/common/EventColumnName.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice/src/main/java/org/apache/hadoop/yarn/server/timelineservice/storage/entity/EntityRowKeyConverter.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice/src/test/java/org/apache/hadoop/yarn/server/timelineservice/storage/common/TestTimelineStorageUtils.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice/src/main/java/org/apache/hadoop/yarn/server/timelineservice/storage/entity/EntityColumnPrefix.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice/src/main/java/org/apache/hadoop/yarn/server/timelineservice/storage/common/TimelineStorageUtils.java
* 

[jira] [Commented] (YARN-5109) timestamps are stored unencoded causing parse errors

2016-05-27 Thread Varun Saxena (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15304659#comment-15304659
 ] 

Varun Saxena commented on YARN-5109:


Thanks [~sjlee0] and [~jrottinghuis] for the review and commit, and for 
suggesting the approach taken in the JIRA.

> timestamps are stored unencoded causing parse errors
> 
>
> Key: YARN-5109
> URL: https://issues.apache.org/jira/browse/YARN-5109
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Affects Versions: YARN-2928
>Reporter: Sangjin Lee
>Assignee: Varun Saxena
>Priority: Blocker
>  Labels: yarn-2928-1st-milestone
> Fix For: YARN-2928
>
> Attachments: YARN-5109-YARN-2928.003.patch, 
> YARN-5109-YARN-2928.01.patch, YARN-5109-YARN-2928.02.patch, 
> YARN-5109-YARN-2928.03.patch, YARN-5109-YARN-2928.04.patch, 
> YARN-5109-YARN-2928.05.patch, YARN-5109-YARN-2928.06.patch, 
> YARN-5109-YARN-2928.07.patch, YARN-5109-YARN-2928.08.patch
>
>
> When we store timestamps (for example as part of the row key or part of the 
> column name for an event), the bytes are used as is without any encoding. If 
> the byte value happens to contain a separator character we use (e.g. "!" or 
> "="), it causes a parse failure when we read it.
> I came across this while looking into this error in the timeline reader:
> {noformat}
> 2016-05-17 21:28:38,643 WARN 
> org.apache.hadoop.yarn.server.timelineservice.storage.common.TimelineStorageUtils:
>  incorrectly formatted column name: it will be discarded
> {noformat}
> I traced the data that was causing this, and the column name (for the event) 
> was the following:
> {noformat}
> i:e!YARN_RM_CONTAINER_CREATED=\x7F\xFF\xFE\xABDY=\x99=YARN_CONTAINER_ALLOCATED_HOST
> {noformat}
> Note that the column name is supposed to be of the format (event 
> id)=(timestamp)=(event info key). However, observe the timestamp portion:
> {noformat}
> \x7F\xFF\xFE\xABDY=\x99
> {noformat}
> The presence of the separator ("=") causes the parse error.
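
To make the failure concrete, here is a minimal Java sketch that rebuilds the column name from the report (without the `i:e!` prefix) and parses it two ways. The class and method names are hypothetical and are not taken from the timeline service code. The width-aware parse is only one possible remedy, shown under the assumption that the timestamp always occupies exactly 8 bytes; it is not necessarily what the committed patch does.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper for illustration only; not part of the Hadoop code base.
final class RawTimestampColumnName {
    static final byte SEP = '=';
    static final int TIMESTAMP_WIDTH = 8; // assumption: timestamp is a raw 8-byte long

    // Naive parse: split on every '=' byte, as if no field could contain one.
    static List<byte[]> splitOnSeparator(byte[] source) {
        List<byte[]> parts = new ArrayList<>();
        int start = 0;
        for (int i = 0; i < source.length; i++) {
            if (source[i] == SEP) {
                parts.add(Arrays.copyOfRange(source, start, i));
                start = i + 1;
            }
        }
        parts.add(Arrays.copyOfRange(source, start, source.length));
        return parts;
    }

    // Width-aware parse: the event id ends at the first '=', the timestamp is
    // read as exactly 8 bytes, and the info key is whatever follows the next '='.
    static List<byte[]> splitFixedWidthTimestamp(byte[] source) {
        int idEnd = 0;
        while (idEnd < source.length && source[idEnd] != SEP) {
            idEnd++;
        }
        int tsStart = idEnd + 1;
        int tsEnd = tsStart + TIMESTAMP_WIDTH;
        if (tsEnd >= source.length || source[tsEnd] != SEP) {
            throw new IllegalArgumentException("incorrectly formatted column name");
        }
        List<byte[]> parts = new ArrayList<>();
        parts.add(Arrays.copyOfRange(source, 0, idEnd));
        parts.add(Arrays.copyOfRange(source, tsStart, tsEnd));
        parts.add(Arrays.copyOfRange(source, tsEnd + 1, source.length));
        return parts;
    }

    // Builds (event id)=(timestamp)=(event info key) with the raw bytes from the report.
    static byte[] sampleColumnName() {
        byte[] id = "YARN_RM_CONTAINER_CREATED".getBytes(StandardCharsets.UTF_8);
        byte[] ts = {0x7F, (byte) 0xFF, (byte) 0xFE, (byte) 0xAB, 'D', 'Y', '=', (byte) 0x99};
        byte[] key = "YARN_CONTAINER_ALLOCATED_HOST".getBytes(StandardCharsets.UTF_8);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        out.write(id, 0, id.length);
        out.write(SEP);
        out.write(ts, 0, ts.length);
        out.write(SEP);
        out.write(key, 0, key.length);
        return out.toByteArray();
    }

    public static void main(String[] args) {
        byte[] name = sampleColumnName();
        System.out.println(splitOnSeparator(name).size());         // prints 4
        System.out.println(splitFixedWidthTimestamp(name).size()); // prints 3
    }
}
```

The naive split yields four fields instead of three because the seventh timestamp byte is 0x3D, which is the "=" character.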



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5109) timestamps are stored unencoded causing parse errors

2016-05-26 Thread Sangjin Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15303236#comment-15303236
 ] 

Sangjin Lee commented on YARN-5109:
---

I am also +1 on the latest patch. I'll wait until the EOD for last minute 
comments before I commit it.


[jira] [Commented] (YARN-5109) timestamps are stored unencoded causing parse errors

2016-05-26 Thread Joep Rottinghuis (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15303184#comment-15303184
 ] 

Joep Rottinghuis commented on YARN-5109:


+1: 07 patch looks good to me. Thanks for all the patience in going back and 
forth on things [~varun_saxena], I think we ended up with a nice clean solution!


[jira] [Commented] (YARN-5109) timestamps are stored unencoded causing parse errors

2016-05-26 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15302969#comment-15302969
 ] 

Hadoop QA commented on YARN-5109:
-

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 12s | Docker mode activated. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 5 new or modified test files. |
| 0 | mvndep | 0m 28s | Maven dependency ordering for branch |
| +1 | mvninstall | 7m 13s | YARN-2928 passed |
| +1 | compile | 1m 23s | YARN-2928 passed with JDK v1.8.0_91 |
| +1 | compile | 1m 31s | YARN-2928 passed with JDK v1.7.0_101 |
| +1 | checkstyle | 0m 27s | YARN-2928 passed |
| +1 | mvnsite | 0m 45s | YARN-2928 passed |
| +1 | mvneclipse | 0m 32s | YARN-2928 passed |
| -1 | findbugs | 0m 36s | hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice-hbase-tests in YARN-2928 has 30 extant Findbugs warnings. |
| +1 | javadoc | 0m 25s | YARN-2928 passed with JDK v1.8.0_91 |
| +1 | javadoc | 0m 30s | YARN-2928 passed with JDK v1.7.0_101 |
| 0 | mvndep | 0m 11s | Maven dependency ordering for patch |
| +1 | mvninstall | 0m 37s | the patch passed |
| +1 | compile | 1m 17s | the patch passed with JDK v1.8.0_91 |
| +1 | javac | 1m 17s | the patch passed |
| +1 | compile | 1m 31s | the patch passed with JDK v1.7.0_101 |
| +1 | javac | 1m 31s | the patch passed |
| +1 | checkstyle | 0m 25s | the patch passed |
| +1 | mvnsite | 0m 42s | the patch passed |
| +1 | mvneclipse | 0m 28s | the patch passed |
| +1 | whitespace | 0m 0s | Patch has no whitespace issues. |
| +1 | findbugs | 1m 33s | the patch passed |
| +1 | javadoc | 0m 22s | the patch passed with JDK v1.8.0_91 |
| +1 | javadoc | 0m 26s | the patch passed with JDK v1.7.0_101 |
| +1 | unit | 0m 41s | hadoop-yarn-server-timelineservice in the patch passed with JDK v1.8.0_91. |
| +1 | unit | 3m 25s | hadoop-yarn-server-timelineservice-hbase-tests in the patch passed with JDK v1.8.0_91. |
| +1 | unit | 0m 47s | hadoop-yarn-server-timelineservice in the patch passed with JDK v1.7.0_101. |
| +1 | unit | 3m 40s | hadoop-yarn-server-timelineservice-hbase-tests in the patch passed with JDK v1.7.0_101. |
| +1 | asflicense | 

[jira] [Commented] (YARN-5109) timestamps are stored unencoded causing parse errors

2016-05-26 Thread Varun Saxena (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15302963#comment-15302963
 ] 

Varun Saxena commented on YARN-5109:


[~jrottinghuis], [~sjlee0], kindly review.
The build is clean. The Findbugs issues are in timelineservice-hbase-tests and 
will be handled by YARN-5142.
The changes in the latest patch over and above the 07 patch add TAB as a 
separator and add encoding/decoding for spaces and tabs in row keys.
I also added code for encoding tabs in column names.
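
For readers unfamiliar with the technique, the following Java sketch shows one way separator characters (including space and tab) could be escaped in string tokens before they are joined into a row key. Everything here is an assumption for illustration: the class name, the percent-style escape tokens, and the choice of "!" as the joining separator are hypothetical and may differ from what Separator.java in the patch actually does.

```java
// Hypothetical sketch; the escape tokens below are NOT taken from the patch.
final class SeparatorEscapeSketch {
    // Each pair is {raw text, escaped form}. The escape character '%' is
    // handled first so that decoding stays unambiguous.
    private static final String[][] ESCAPES = {
        {"%", "%25"},
        {"!", "%21"},
        {"=", "%3D"},
        {" ", "%20"},
        {"\t", "%09"},
    };

    static String encode(String token) {
        String out = token;
        for (String[] pair : ESCAPES) {
            out = out.replace(pair[0], pair[1]);
        }
        return out;
    }

    static String decode(String token) {
        String out = token;
        // Undo the escapes in reverse order so that "%25" is restored last.
        for (int i = ESCAPES.length - 1; i >= 0; i--) {
            out = out.replace(ESCAPES[i][1], ESCAPES[i][0]);
        }
        return out;
    }

    // Joins encoded tokens with '!', which can no longer occur inside a token.
    static String joinRowKey(String... tokens) {
        StringBuilder key = new StringBuilder();
        for (int i = 0; i < tokens.length; i++) {
            if (i > 0) {
                key.append('!');
            }
            key.append(encode(tokens[i]));
        }
        return key.toString();
    }

    // Splits on '!' and decodes each token back to its original text.
    static String[] splitRowKey(String rowKey) {
        String[] parts = rowKey.split("!", -1);
        for (int i = 0; i < parts.length; i++) {
            parts[i] = decode(parts[i]);
        }
        return parts;
    }

    public static void main(String[] args) {
        String key = joinRowKey("my cluster", "a!b", "flow\tname");
        System.out.println(key); // prints my%20cluster!a%21b!flow%09name
        System.out.println(splitRowKey(key).length); // prints 3
    }
}
```

This only covers string tokens. Raw numeric fields such as timestamps are a separate problem, because their bytes cannot be escaped this way without changing their width.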


[jira] [Commented] (YARN-5109) timestamps are stored unencoded causing parse errors

2016-05-26 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15302958#comment-15302958
 ] 

Hadoop QA commented on YARN-5109:
-

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 16s | Docker mode activated. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 5 new or modified test files. |
| 0 | mvndep | 1m 23s | Maven dependency ordering for branch |
| +1 | mvninstall | 7m 7s | YARN-2928 passed |
| +1 | compile | 1m 16s | YARN-2928 passed with JDK v1.8.0_91 |
| +1 | compile | 1m 32s | YARN-2928 passed with JDK v1.7.0_101 |
| +1 | checkstyle | 0m 27s | YARN-2928 passed |
| +1 | mvnsite | 0m 48s | YARN-2928 passed |
| +1 | mvneclipse | 0m 32s | YARN-2928 passed |
| -1 | findbugs | 0m 36s | hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice-hbase-tests in YARN-2928 has 30 extant Findbugs warnings. |
| +1 | javadoc | 0m 25s | YARN-2928 passed with JDK v1.8.0_91 |
| +1 | javadoc | 0m 30s | YARN-2928 passed with JDK v1.7.0_101 |
| 0 | mvndep | 0m 11s | Maven dependency ordering for patch |
| +1 | mvninstall | 0m 37s | the patch passed |
| +1 | compile | 1m 16s | the patch passed with JDK v1.8.0_91 |
| +1 | javac | 1m 16s | the patch passed |
| +1 | compile | 1m 30s | the patch passed with JDK v1.7.0_101 |
| +1 | javac | 1m 30s | the patch passed |
| +1 | checkstyle | 0m 25s | the patch passed |
| +1 | mvnsite | 0m 42s | the patch passed |
| +1 | mvneclipse | 0m 28s | the patch passed |
| +1 | whitespace | 0m 0s | Patch has no whitespace issues. |
| +1 | findbugs | 1m 31s | the patch passed |
| +1 | javadoc | 0m 21s | the patch passed with JDK v1.8.0_91 |
| +1 | javadoc | 0m 27s | the patch passed with JDK v1.7.0_101 |
| +1 | unit | 0m 40s | hadoop-yarn-server-timelineservice in the patch passed with JDK v1.8.0_91. |
| +1 | unit | 3m 50s | hadoop-yarn-server-timelineservice-hbase-tests in the patch passed with JDK v1.8.0_91. |
| +1 | unit | 0m 47s | hadoop-yarn-server-timelineservice in the patch passed with JDK v1.7.0_101. |
| +1 | unit | 4m 8s | hadoop-yarn-server-timelineservice-hbase-tests in the patch passed with JDK v1.7.0_101. |
| +1 | asflicense | 

[jira] [Commented] (YARN-5109) timestamps are stored unencoded causing parse errors

2016-05-26 Thread Varun Saxena (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15302764#comment-15302764
 ] 

Varun Saxena commented on YARN-5109:


[~jrottinghuis], I am currently writing code for encoding tabs and spaces in 
row keys and column names. Will update patch shortly.


[jira] [Commented] (YARN-5109) timestamps are stored unencoded causing parse errors

2016-05-26 Thread Joep Rottinghuis (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15302736#comment-15302736
 ] 

Joep Rottinghuis commented on YARN-5109:


[~varun_saxena] let's leave that replace alone for right now. I'm about to file 
a separate jira with another (related) issue which would change that code 
anyway. Let's see if we can nail down this patch and get it in.
So, String#replace and StringUtils#replace aside, is 
YARN-5109-YARN-2928.07.patch the patch to look at, or are there more open 
issues pending?


[jira] [Commented] (YARN-5109) timestamps are stored unencoded causing parse errors

2016-05-26 Thread Varun Saxena (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15302672#comment-15302672
 ] 

Varun Saxena commented on YARN-5109:


[~sjlee0], [~jrottinghuis]
In Separator#encode, we are using String#replace, which in turn uses Pattern. 
Why don't we use StringUtils#replace instead?
I think the former would be slower. 
Thoughts?
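
As background for this question: on the JDK 7 and 8 releases used in the QA builds, String#replace(CharSequence, CharSequence) compiles a literal Pattern on every call, as far as I know. A Pattern-free replacement can be written with indexOf alone. The sketch below is a hand-rolled illustration of that idea with a hypothetical class and method name; it is not the StringUtils implementation, and any speed difference would need to be measured.

```java
// Hypothetical sketch of a literal replace that avoids java.util.regex.
final class LiteralReplaceSketch {
    // Replaces every occurrence of target with replacement using indexOf only.
    static String replaceLiteral(String text, String target, String replacement) {
        if (text == null || text.isEmpty() || target == null || target.isEmpty()) {
            return text;
        }
        int match = text.indexOf(target);
        if (match < 0) {
            return text; // nothing to replace, so no new string is built
        }
        StringBuilder out = new StringBuilder(text.length() + 16);
        int start = 0;
        while (match >= 0) {
            // Copy the text before the match, then the replacement.
            out.append(text, start, match).append(replacement);
            start = match + target.length();
            match = text.indexOf(target, start);
        }
        out.append(text, start, text.length());
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(replaceLiteral("a=b=c", "=", "%3D")); // prints a%3Db%3Dc
    }
}
```

The early return matters for encoding row keys: most tokens contain no separator at all, and in that case the input string is returned unchanged.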


[jira] [Commented] (YARN-5109) timestamps are stored unencoded causing parse errors

2016-05-26 Thread Varun Saxena (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15302651#comment-15302651
 ] 

Varun Saxena commented on YARN-5109:


testWriteNullApplicationToHBase was failing because of the test case itself. I 
guess it did not show up earlier because of the order in which the tests run.
We were setting Scan#setStartRow in this test but not setting a stop row, which 
meant a row inserted by the new test I added was being picked up.
I will fix the test case in this JIRA itself.
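
To illustrate the kind of test interference described here, the sketch below models a table as a sorted map and compares a scan that has only a start row with one that also has a stop row. It assumes the scan in the test had a start row but no stop row; it does not use the HBase API, and the class name, method name and row keys are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

/** Sketch only: a sorted map standing in for a table of row keys. */
final class ScanRangeDemo {
  /**
   * Returns row keys from startRow (inclusive) up to stopRow (exclusive).
   * A null stopRow means the scan runs to the end of the table.
   */
  static List<String> scan(TreeMap<String, String> table,
      String startRow, String stopRow) {
    if (stopRow == null) {
      return new ArrayList<>(table.tailMap(startRow, true).keySet());
    }
    return new ArrayList<>(
        table.subMap(startRow, true, stopRow, false).keySet());
  }

  public static void main(String[] args) {
    TreeMap<String, String> table = new TreeMap<>();
    table.put("test1!row1", "written by the test under discussion");
    table.put("test1!row2", "written by the test under discussion");
    table.put("test2!row1", "written by another test");

    // Start row only: the row from the other test is picked up as well.
    System.out.println(scan(table, "test1", null).size());
    // Start and stop row: only the rows in the intended range are returned.
    System.out.println(scan(table, "test1", "test2").size());
  }
}
```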





[jira] [Commented] (YARN-5109) timestamps are stored unencoded causing parse errors

2016-05-26 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15302548#comment-15302548
 ] 

Hadoop QA commented on YARN-5109:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 18s {color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s {color} | {color:green} The patch appears to include 5 new or modified test files. {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 55s {color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 8m 53s {color} | {color:green} YARN-2928 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 49s {color} | {color:green} YARN-2928 passed with JDK v1.8.0_91 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 53s {color} | {color:green} YARN-2928 passed with JDK v1.7.0_101 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 29s {color} | {color:green} YARN-2928 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 52s {color} | {color:green} YARN-2928 passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 35s {color} | {color:green} YARN-2928 passed {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 0m 13s {color} | {color:red} branch/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice-hbase-tests no findbugs output file (hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice-hbase-tests/target/findbugsXml.xml) {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 33s {color} | {color:green} YARN-2928 passed with JDK v1.8.0_91 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 40s {color} | {color:green} YARN-2928 passed with JDK v1.7.0_101 {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 12s {color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:red}-1{color} | {color:red} mvninstall {color} | {color:red} 0m 14s {color} | {color:red} hadoop-yarn-server-timelineservice-hbase-tests in the patch failed. {color} |
| {color:red}-1{color} | {color:red} compile {color} | {color:red} 1m 43s {color} | {color:red} hadoop-yarn-server in the patch failed with JDK v1.8.0_91. {color} |
| {color:red}-1{color} | {color:red} javac {color} | {color:red} 1m 43s {color} | {color:red} hadoop-yarn-server in the patch failed with JDK v1.8.0_91. {color} |
| {color:red}-1{color} | {color:red} compile {color} | {color:red} 1m 50s {color} | {color:red} hadoop-yarn-server in the patch failed with JDK v1.7.0_101. {color} |
| {color:red}-1{color} | {color:red} javac {color} | {color:red} 1m 50s {color} | {color:red} hadoop-yarn-server in the patch failed with JDK v1.7.0_101. {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 35s {color} | {color:red} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server: patch generated 10 new + 2 unchanged - 0 fixed = 12 total (was 2) {color} |
| {color:red}-1{color} | {color:red} mvnsite {color} | {color:red} 0m 18s {color} | {color:red} hadoop-yarn-server-timelineservice-hbase-tests in the patch failed. {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 30s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 0m 14s {color} | {color:red} hadoop-yarn-server-timelineservice-hbase-tests in the patch failed. {color} |
| {color:red}-1{color} | {color:red} javadoc {color} | {color:red} 1m 37s {color} | {color:red} hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-timelineservice-jdk1.8.0_91 with JDK v1.8.0_91 generated 2 new + 0 unchanged - 0 fixed = 2 total (was 0) {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 26s {color} | {color:green} the patch passed with JDK v1.8.0_91 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 30s {color} | {color:green} the patch passed with JDK v1.7.0_101 {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 57s {color} | {color:green} 

[jira] [Commented] (YARN-5109) timestamps are stored unencoded causing parse errors

2016-05-26 Thread Varun Saxena (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15302534#comment-15302534
 ] 

Varun Saxena commented on YARN-5109:


Yeah, I hadn't rebased the branch. Will check.




[jira] [Commented] (YARN-5109) timestamps are stored unencoded causing parse errors

2016-05-26 Thread Sangjin Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15302528#comment-15302528
 ] 

Sangjin Lee commented on YARN-5109:
---

Also, one of the unit tests is failing. Haven't checked again if the issue 
exists in the branch itself, but I have been running the tests regularly and 
haven't seen this, so it might be introduced by the patch.

{noformat}
testWriteNullApplicationToHBase(org.apache.hadoop.yarn.server.timelineservice.storage.TestHBaseTimelineStorage)  Time elapsed: 0.09 sec  <<< FAILURE!
java.lang.AssertionError: expected:<0> but was:<1>
    at org.junit.Assert.fail(Assert.java:88)
    at org.junit.Assert.failNotEquals(Assert.java:743)
    at org.junit.Assert.assertEquals(Assert.java:118)
    at org.junit.Assert.assertEquals(Assert.java:555)
    at org.junit.Assert.assertEquals(Assert.java:542)
    at org.apache.hadoop.yarn.server.timelineservice.storage.TestHBaseTimelineStorage.testWriteNullApplicationToHBase(TestHBaseTimelineStorage.java:536)
{noformat}




[jira] [Commented] (YARN-5109) timestamps are stored unencoded causing parse errors

2016-05-26 Thread Varun Saxena (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15302520#comment-15302520
 ] 

Varun Saxena commented on YARN-5109:


Ohh...I hadn't updated my branch. Let me fix this and update.




[jira] [Commented] (YARN-5109) timestamps are stored unencoded causing parse errors

2016-05-26 Thread Sangjin Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15302510#comment-15302510
 ] 

Sangjin Lee commented on YARN-5109:
---

Thanks for updating the patch [~varun_saxena]! I'll go over it one more time. 
FYI, it appears it doesn't compile cleanly? 

{noformat}
[ERROR] Failed to execute goal org.apache.maven.plugins:maven-compiler-plugin:3.1:testCompile (default-testCompile) on project hadoop-yarn-server-timelineservice-hbase-tests: Compilation failure
[ERROR] /Users/sjlee/git/hadoop-ats/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice-hbase-tests/src/test/java/org/apache/hadoop/yarn/server/timelineservice/storage/TestHBaseTimelineStorage.java:[521,24] cannot find symbol
[ERROR] symbol:   variable Bytes
[ERROR] location: class org.apache.hadoop.yarn.server.timelineservice.storage.TestHBaseTimelineStorage
{noformat}

Missing import?




[jira] [Commented] (YARN-5109) timestamps are stored unencoded causing parse errors

2016-05-26 Thread Joep Rottinghuis (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15302345#comment-15302345
 ] 

Joep Rottinghuis commented on YARN-5109:


Indeed it is easy to do now with the way KeyConverter and Separator are 
written, and yeah, I was ambiguous about whether we should encode.
After thinking about it a bit more, I do think we should encode tabs as well. 
If we encode both, we should ensure that we encode and decode in the same order.
As a general rule we should probably encode/decode things that are specified by 
a user, especially those in which we can expect to see spaces (or tabs), and as 
a good practice any value that comes from a user and goes into a column 
qualifier.
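
As an illustration of why the order of the substitutions matters, here is a minimal sketch of a space/tab codec. It is not the Separator class from the patch: the class name and the percent-style escape tokens are hypothetical, chosen only for this example. In this sketch the decode step undoes the substitutions in the reverse of the order in which encode applied them, which is one way of keeping the two sides consistent.

```java
/** Sketch only: escapes spaces and tabs so they never appear in a key. */
final class SpaceTabCodec {
  // Hypothetical tokens; the real encoding may use different ones.
  private static final String PERCENT_TOKEN = "%25";
  private static final String SPACE_TOKEN = "%20";
  private static final String TAB_TOKEN = "%09";

  /** Escapes '%' first so literal tokens in the input stay distinguishable. */
  static String encode(String raw) {
    return raw.replace("%", PERCENT_TOKEN)
        .replace(" ", SPACE_TOKEN)
        .replace("\t", TAB_TOKEN);
  }

  /** Undoes the substitutions in the reverse of the order used by encode. */
  static String decode(String encoded) {
    return encoded.replace(TAB_TOKEN, "\t")
        .replace(SPACE_TOKEN, " ")
        .replace(PERCENT_TOKEN, "%");
  }

  public static void main(String[] args) {
    String name = "my app\tname 100%20";
    String encoded = encode(name);
    System.out.println(encoded);
    System.out.println(decode(encoded).equals(name));
  }
}
```

If decode restored "%" before the other two tokens, an input that literally contains "%20" would come back as a space, so the round trip would no longer hold.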




[jira] [Commented] (YARN-5109) timestamps are stored unencoded causing parse errors

2016-05-26 Thread Varun Saxena (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15301889#comment-15301889
 ] 

Varun Saxena commented on YARN-5109:


bq.  It would be relatively easy to encode (and decode) tabs in strings, which 
should just happen in one or two methods now right?
Not sure if I understood your comment. Do you mean that support for tab 
encoding should be easy now with the KeyConverter interface? Or do you want me 
to encode tabs in column qualifiers too? From the explanation above, tabs 
should be a problem too, right?
Also, I think we should encode spaces in the event id and event info key in 
EventColumnNameConverter.




[jira] [Commented] (YARN-5109) timestamps are stored unencoded causing parse errors

2016-05-25 Thread Joep Rottinghuis (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15301051#comment-15301051
 ] 

Joep Rottinghuis commented on YARN-5109:


[~varun_saxena] please go ahead with the patch. I was thinking about a couple 
of additional items, but that goes beyond this patch, so I'll file a separate 
jira and patch for those, so please go ahead with what you have and what we 
discussed.

Wrt. encoding a long, you're right, we should probably have a 
LongKeyConverter and use that to cleanly go back and forth. Note that we 
already have a LongConverter implementing a slightly different interface 
(NumericValueConverter). We could either add an additional interface to it, 
or create a new class and delegate the implementation to the existing 
class.
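
A rough sketch of the second option (a new class delegating to the existing one) might look as follows. The interface shape, the method names and the ExistingLongConverter stand-in are all assumptions made for illustration; they are not the signatures of the actual KeyConverter or LongConverter classes.

```java
import java.nio.ByteBuffer;

/** Hypothetical shape of a key converter; not the real interface. */
interface KeyConverter<T> {
  byte[] encode(T key);

  T decode(byte[] bytes);
}

/** Hypothetical stand-in for the existing value-oriented converter. */
final class ExistingLongConverter {
  byte[] toBytes(long value) {
    return ByteBuffer.allocate(8).putLong(value).array();
  }

  long fromBytes(byte[] bytes) {
    if (bytes == null || bytes.length != 8) {
      throw new IllegalArgumentException("expected exactly 8 bytes");
    }
    return ByteBuffer.wrap(bytes).getLong();
  }
}

/** New key converter that delegates to the existing implementation. */
final class LongKeyConverter implements KeyConverter<Long> {
  private final ExistingLongConverter delegate = new ExistingLongConverter();

  @Override
  public byte[] encode(Long key) {
    return delegate.toBytes(key);
  }

  @Override
  public Long decode(byte[] bytes) {
    return delegate.fromBytes(bytes);
  }
}
```

Delegation keeps a single implementation of the byte layout, so the key path and the value path cannot drift apart.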

bq. "By the way, we are encoding spaces in column qualifiers. Any reason why we 
would not want spaces in column qualifiers ? We are not using space as a 
separator."

Yeah, while in HBase you can technically use non-printable characters or pretty 
much any series of bytes as column qualifiers, spaces (or even tabs) make our 
life really difficult when working with the data, and especially when 
administering any of this through the HBase shell. We don't really expect tabs 
in names that are sent, but spaces are common in application names.
Spaces are similarly inconvenient to deal with in REST-style calls, but that is 
a slightly different matter. Tabs are also often used in MapReduce to separate 
keys and values, which adds a further headache. It would be 
relatively easy to encode (and decode) tabs in strings, which should just 
happen in one or two methods now, right?




[jira] [Commented] (YARN-5109) timestamps are stored unencoded causing parse errors

2016-05-25 Thread Varun Saxena (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15300646#comment-15300646
 ] 

Varun Saxena commented on YARN-5109:


By the way, we are encoding spaces in column qualifiers. Is there any reason 
why we would not want spaces in column qualifiers? We are not using space as a 
separator.
Moreover, in the event column name we are not encoding spaces for the event id 
and event info components. Is it not required?




[jira] [Commented] (YARN-5109) timestamps are stored unencoded causing parse errors

2016-05-25 Thread Varun Saxena (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15300570#comment-15300570
 ] 

Varun Saxena commented on YARN-5109:


Regarding point 1, I had seen this while coding. GenericObjectMapper will 
basically generate a String representation of the long, and for longs space 
encoding won't be required. But to make it consistent, I can probably convert 
the long to a String instead of calling GenericObjectMapper, and then call 
StringKeyConverter.

For point 2, I think we can add a javadoc. I was thinking of it, but it later 
slipped my mind.

Agree with point 3, can change this method as well.

If you are reviewing, I can probably wait for more comments and then update the 
patch.




[jira] [Commented] (YARN-5109) timestamps are stored unencoded causing parse errors

2016-05-25 Thread Joep Rottinghuis (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15299567#comment-15299567
 ] 

Joep Rottinghuis commented on YARN-5109:


Overall, the approach and implementation look really clean and solid.
I'll have to look a little further tomorrow, but I have three comments:
1) While reviewing the patch for all places where we convert strings into 
qualifiers for rows and columns, it struck me that in 
HBaseTimelineWriterImpl.storeInFlowActivityTable we still use 
GenericObjectMapper to go from a String to a column qualifier.
See line 190 (after applying the patch):
{code}
byte[] qualifier = GenericObjectMapper.write(flowRunId);
{code}

In TimelineEntityReader.parseEntity (line 140) we use the regular 
StringKeyConverter:
{code}
FlowActivityColumnPrefix.RUN_ID.readResults(result,
StringKeyConverter.getInstance());
for (Map.Entry e : runIdsMap.entrySet()) {
{code}

The writer and the reader should agree on the converter, so presumably the 
writer code above should read:
{code}
 byte[] qualifier = StringKeyConverter.getInstance().encode(flowRunId);
{code}

2) The encode() and decode() APIs sound like they should be symmetric. For 
StringKeyConverter they are:
StringKeyConverter.getInstance().decode(encode(s)).equals(s)
for any String s (I haven't checked null or "").
Similarly
{code}
Bytes.equals(b, StringKeyConverter.getInstance().encode(decode(b))) == true
{code}
for any byte[] b.
This isn't true for ApplicationRowKey etc.
I understand this is because we use the same encode to create a prefix and I 
understand that we normally expect all fields to be present when we create an 
ApplicationRowKey.

Somehow it would be nicer to create an applicationRowKey object and then 
perhaps have a validate method on it, or be able to create one as a prefix 
(perhaps with a separate constructor).
Right now I don't really have a consistent suggestion for this, other than the 
comment that this somehow seems "unpleasant".

If we do decide to keep the current approach (perhaps because the alternatives 
are just as ugly or unpleasant), then we should at least document in the 
javadoc for the encode and decode methods that the invariant one might expect 
from the naming doesn't have to hold for all implementations. We should also 
update the javadoc in the implementations from a simple @Override to a 
generated javadoc referring to the interface method.
I'm not sure if this is the Hadoop coding standard, but I've always found it a 
little silly to mark methods that implement an interface with only @Override. 
For example
{code}
  /*
   * (non-Javadoc)
   * 
   * @see
   * org.apache.hadoop.yarn.server.timelineservice.storage.common.KeyConverter
   * #encode(java.lang.Object)
   */
  public byte[] encode(EntityRowKey rowKey) {
{code}
makes more sense to me than
{code}
  @Override  
  public byte[] encode(EntityRowKey rowKey) {
{code}
but that is probably more a matter of taste...

3) Why don't we do to ColumnHelper.readResultsWithTimestamps what we did to 
readResults?
That would make the method signature:
{code}
 public  NavigableMap> readResultsWithTimestamps(
  Result result, byte[] columnPrefixBytes, KeyConverter keyConverter)
  throws IOException {
{code}
With equivalent changes up the call stack in 
ApplicationColumnPrefix.readResultsWithTimestamps, 
EntityColumnPrefix.readResultsWithTimestamps, 
FlowActivityColumnPrefix.readResultsWithTimestamps, and 
FlowRunColumnPrefix.readResultsWithTimestamps and all their respective uses.
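
The symmetry expectation from comment 2 can be written down as two small checks. The following is only a sketch under stated assumptions: SketchKeyConverter, Utf8KeyConverter and RoundTripCheck are hypothetical names invented for illustration, not the classes in the patch, and plain UTF-8 stands in for whatever escaping the real StringKeyConverter applies.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

/** Hypothetical stand-in for the KeyConverter interface discussed above. */
interface SketchKeyConverter<T> {
  byte[] encode(T key);

  T decode(byte[] bytes);
}

/** Plain UTF-8 converter, used only to illustrate the round-trip checks. */
final class Utf8KeyConverter implements SketchKeyConverter<String> {
  @Override
  public byte[] encode(String key) {
    return key.getBytes(StandardCharsets.UTF_8);
  }

  @Override
  public String decode(byte[] bytes) {
    return new String(bytes, StandardCharsets.UTF_8);
  }
}

final class RoundTripCheck {
  /** decode(encode(key)) should give back the original key. */
  static <T> boolean decodeInvertsEncode(SketchKeyConverter<T> c, T key) {
    return c.decode(c.encode(key)).equals(key);
  }

  /** encode(decode(bytes)) should give back the original bytes. */
  static <T> boolean encodeInvertsDecode(SketchKeyConverter<T> c, byte[] bytes) {
    return Arrays.equals(bytes, c.encode(c.decode(bytes)));
  }

  public static void main(String[] args) {
    SketchKeyConverter<String> c = new Utf8KeyConverter();
    System.out.println(decodeInvertsEncode(c, "flow_run_1"));
    System.out.println(encodeInvertsDecode(c, c.encode("flow_run_1")));
    // 0xFF is not valid UTF-8, so decoding replaces it and the bytes change.
    System.out.println(encodeInvertsDecode(c, new byte[] {(byte) 0xFF}));
  }
}
```

Even for this simple converter the second property holds only for byte arrays that are valid output of encode, which is one reason the javadoc should state exactly which invariant each implementation guarantees.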


[jira] [Commented] (YARN-5109) timestamps are stored unencoded causing parse errors

2016-05-24 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15299023#comment-15299023
 ] 

Hadoop QA commented on YARN-5109:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 14s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
0s {color} | {color:green} The patch appears to include 5 new or modified test 
files. {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 22s 
{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 
57s {color} | {color:green} YARN-2928 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 30s 
{color} | {color:green} YARN-2928 passed with JDK v1.8.0_91 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 39s 
{color} | {color:green} YARN-2928 passed with JDK v1.7.0_101 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
28s {color} | {color:green} YARN-2928 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 47s 
{color} | {color:green} YARN-2928 passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
33s {color} | {color:green} YARN-2928 passed {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 0m 13s 
{color} | {color:red} 
branch/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice-hbase-tests
 no findbugs output file 
(hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice-hbase-tests/target/findbugsXml.xml)
 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 27s 
{color} | {color:green} YARN-2928 passed with JDK v1.8.0_91 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 31s 
{color} | {color:green} YARN-2928 passed with JDK v1.7.0_101 {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 11s 
{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 
38s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 28s 
{color} | {color:green} the patch passed with JDK v1.8.0_91 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 28s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 34s 
{color} | {color:green} the patch passed with JDK v1.7.0_101 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 34s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
26s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 42s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
29s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 0m 11s 
{color} | {color:red} 
patch/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice-hbase-tests
 no findbugs output file 
(hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice-hbase-tests/target/findbugsXml.xml)
 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 21s 
{color} | {color:green} the patch passed with JDK v1.8.0_91 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 28s 
{color} | {color:green} the patch passed with JDK v1.7.0_101 {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 45s 
{color} | {color:green} hadoop-yarn-server-timelineservice in the patch passed 
with JDK v1.8.0_91. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 3m 38s 
{color} | {color:green} hadoop-yarn-server-timelineservice-hbase-tests in the 
patch passed with JDK v1.8.0_91. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 48s 
{color} | {color:green} 

[jira] [Commented] (YARN-5109) timestamps are stored unencoded causing parse errors

2016-05-24 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15299022#comment-15299022
 ] 

Hadoop QA commented on YARN-5109:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 12m 11s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
0s {color} | {color:green} The patch appears to include 5 new or modified test 
files. {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 2m 28s 
{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 8m 
48s {color} | {color:green} YARN-2928 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 20s 
{color} | {color:green} YARN-2928 passed with JDK v1.8.0_91 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 31s 
{color} | {color:green} YARN-2928 passed with JDK v1.7.0_101 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
31s {color} | {color:green} YARN-2928 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 48s 
{color} | {color:green} YARN-2928 passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
38s {color} | {color:green} YARN-2928 passed {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 0m 13s 
{color} | {color:red} 
branch/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice-hbase-tests
 no findbugs output file 
(hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice-hbase-tests/target/findbugsXml.xml)
 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 27s 
{color} | {color:green} YARN-2928 passed with JDK v1.8.0_91 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 31s 
{color} | {color:green} YARN-2928 passed with JDK v1.7.0_101 {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 11s 
{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 
37s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 17s 
{color} | {color:green} the patch passed with JDK v1.8.0_91 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 17s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 30s 
{color} | {color:green} the patch passed with JDK v1.7.0_101 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 30s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
26s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 41s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
28s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 0m 11s 
{color} | {color:red} 
patch/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice-hbase-tests
 no findbugs output file 
(hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice-hbase-tests/target/findbugsXml.xml)
 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 22s 
{color} | {color:green} the patch passed with JDK v1.8.0_91 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 26s 
{color} | {color:green} the patch passed with JDK v1.7.0_101 {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 41s 
{color} | {color:green} hadoop-yarn-server-timelineservice in the patch passed 
with JDK v1.8.0_91. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 3m 23s 
{color} | {color:green} hadoop-yarn-server-timelineservice-hbase-tests in the 
patch passed with JDK v1.8.0_91. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 47s 
{color} | {color:green} 

[jira] [Commented] (YARN-5109) timestamps are stored unencoded causing parse errors

2016-05-24 Thread Varun Saxena (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15298851#comment-15298851
 ] 

Varun Saxena commented on YARN-5109:


Makes sense. Will make it private.


[jira] [Commented] (YARN-5109) timestamps are stored unencoded causing parse errors

2016-05-24 Thread Sangjin Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15298433#comment-15298433
 ] 

Sangjin Lee commented on YARN-5109:
---

Thanks for updating the patch [~varun_saxena]! I think it's almost there.

Just one thing I noticed (which I should have noticed earlier) is that the 
*static* {{split()}} methods do not really need to be public as they are 
exclusively used by the instance {{split()}} methods. In fact, it might not be 
a good idea for them to be used with an arbitrary separator outside the ones we 
define here. Can we make the static {{split()}} methods all private?
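
A rough sketch of the visibility change being asked for, under simplifying assumptions: SketchSeparator is a hypothetical enum with single-byte separators, whereas the real Separator class is more involved. The point is only that the static helper is private, so callers can reach it solely through a constant that fixes the separator.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

enum SketchSeparator {
  QUALIFIERS('!'),
  VALUES('=');

  private final byte value;

  SketchSeparator(char c) {
    this.value = (byte) c;
  }

  /** Public entry point: always splits on this constant's own separator. */
  public List<byte[]> split(byte[] source) {
    return split(source, value);
  }

  /** Private helper: outside code cannot call it with an arbitrary separator. */
  private static List<byte[]> split(byte[] source, byte separator) {
    List<byte[]> parts = new ArrayList<>();
    int start = 0;
    for (int i = 0; i <= source.length; i++) {
      // Close the current part at each separator and at the end of the input.
      if (i == source.length || source[i] == separator) {
        parts.add(Arrays.copyOfRange(source, start, i));
        start = i + 1;
      }
    }
    return parts;
  }
}
```

With this shape, SketchSeparator.VALUES.split(bytes) is the only way to split on "=", and there is no public overload that accepts an undefined separator.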


[jira] [Commented] (YARN-5109) timestamps are stored unencoded causing parse errors

2016-05-24 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15298019#comment-15298019
 ] 

Hadoop QA commented on YARN-5109:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 16s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
0s {color} | {color:green} The patch appears to include 5 new or modified test 
files. {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 24s 
{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 
20s {color} | {color:green} YARN-2928 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 20s 
{color} | {color:green} YARN-2928 passed with JDK v1.8.0_91 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 31s 
{color} | {color:green} YARN-2928 passed with JDK v1.7.0_101 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
28s {color} | {color:green} YARN-2928 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 47s 
{color} | {color:green} YARN-2928 passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
32s {color} | {color:green} YARN-2928 passed {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 0m 13s 
{color} | {color:red} 
branch/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice-hbase-tests
 no findbugs output file 
(hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice-hbase-tests/target/findbugsXml.xml)
 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 25s 
{color} | {color:green} YARN-2928 passed with JDK v1.8.0_91 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 30s 
{color} | {color:green} YARN-2928 passed with JDK v1.7.0_101 {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 11s 
{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 
37s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 14s 
{color} | {color:green} the patch passed with JDK v1.8.0_91 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 14s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 30s 
{color} | {color:green} the patch passed with JDK v1.7.0_101 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 30s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
25s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 41s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
28s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 0m 11s 
{color} | {color:red} 
patch/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice-hbase-tests
 no findbugs output file 
(hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice-hbase-tests/target/findbugsXml.xml)
 {color} |
| {color:red}-1{color} | {color:red} javadoc {color} | {color:red} 0m 13s 
{color} | {color:red} hadoop-yarn-server-timelineservice in the patch failed 
with JDK v1.8.0_91. {color} |
| {color:red}-1{color} | {color:red} javadoc {color} | {color:red} 2m 5s 
{color} | {color:red} 
hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-timelineservice-jdk1.7.0_101
 with JDK v1.7.0_101 generated 1 new + 0 unchanged - 0 fixed = 1 total (was 0) 
{color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 26s 
{color} | {color:green} the patch passed with JDK v1.7.0_101 {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 41s 
{color} | {color:green} hadoop-yarn-server-timelineservice in the patch passed 
with JDK v1.8.0_91. 

[jira] [Commented] (YARN-5109) timestamps are stored unencoded causing parse errors

2016-05-24 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15297969#comment-15297969
 ] 

Hadoop QA commented on YARN-5109:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:red}-1{color} | {color:red} docker {color} | {color:red} 6m 13s 
{color} | {color:red} Docker failed to build yetus/hadoop:cf2ee45. {color} |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12805841/YARN-5109-YARN-2928.04.patch
 |
| JIRA Issue | YARN-5109 |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/11649/console |
| Powered by | Apache Yetus 0.2.0   http://yetus.apache.org |


This message was automatically generated.




[jira] [Commented] (YARN-5109) timestamps are stored unencoded causing parse errors

2016-05-24 Thread Varun Saxena (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15297844#comment-15297844
 ] 

Varun Saxena commented on YARN-5109:


bq. Also, do we have a test that tests an encoded long having a separator in 
it? After all, that's what caused us to uncover this issue.
Yes, we do. In TestKeyConverters, I create the flow run id and the cluster 
timestamp (in the app id) in a manner that puts separators in them. The event 
column name issue is also simulated. In fact, the test also covers the case 
where QUALIFIER changes in the future. 
TestHBaseTimelineStorage#testEventsEscapeTs covers the issue with the event 
column name in an E2E test case.

bq. Should we replace "" with Separator.EMPTY_BYTES? That should be equivalent, 
right?
Not exactly. We are calling joinEncoded, which takes strings. If we called 
join instead, we would first have to encode the string. In any case, I have 
added a constant EMPTY_STRING in Separator and am using it.

bq. I think NO_LIMIT_SPLIT and VARIABLE_SIZE are getting confusing. Since we're 
using VARIABLE_SIZE for the most part, can we remove NO_LIMIT_SPLIT?
NO_LIMIT_SPLIT indicates that there is no limit on the number of splits 
returned. VARIABLE_SIZE indicates that the size of a segment in the split is 
variable. That said, we can treat VARIABLE_SIZE as also meaning that the number 
of splits is not fixed.

Other issues have been fixed.
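
For readers following the VARIABLE_SIZE discussion, here is a simplified sketch of a size-aware split. It is an illustration under assumptions, not the code in the patch: SizedSplitSketch, its single-byte separator and its error handling are hypothetical, and VARIABLE_SIZE is given the value 0 only for this example. A fixed-size segment is consumed by length, so a separator byte inside an 8-byte timestamp is never treated as a delimiter.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class SizedSplitSketch {
  /** Marks a segment of unknown length that ends at the next separator. */
  static final int VARIABLE_SIZE = 0;

  /** Splits source into exactly sizes.length segments according to the layout. */
  static List<byte[]> split(byte[] source, byte separator, int[] sizes) {
    List<byte[]> segments = new ArrayList<>();
    int pos = 0;
    for (int i = 0; i < sizes.length; i++) {
      boolean last = (i == sizes.length - 1);
      int end;
      if (sizes[i] != VARIABLE_SIZE) {
        end = pos + sizes[i];   // fixed width: do not inspect the content
      } else if (last) {
        end = source.length;    // last variable segment takes the rest
      } else {
        end = pos;
        while (end < source.length && source[end] != separator) {
          end++;
        }
      }
      // A non-final segment must be followed by a separator; the final one
      // must end exactly at the end of the input.
      boolean ok = last
          ? end == source.length
          : end < source.length && source[end] == separator;
      if (!ok) {
        throw new IllegalArgumentException("segment " + i + " does not fit the layout");
      }
      segments.add(Arrays.copyOfRange(source, pos, end));
      pos = end + 1;            // skip the separator after this segment
    }
    return segments;
  }
}
```

For an event column name, a layout of {VARIABLE_SIZE, 8, VARIABLE_SIZE} yields three segments (event id, timestamp, info key) even when the timestamp bytes contain "=". The sketch assumes the variable-size parts have already been escaped so that they cannot contain the separator themselves.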





[jira] [Commented] (YARN-5109) timestamps are stored unencoded causing parse errors

2016-05-23 Thread Sangjin Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15297099#comment-15297099
 ] 

Sangjin Lee commented on YARN-5109:
---

Thanks [~varun_saxena] for the patch! I think it's almost there. I have a few 
mostly minor comments. It would be great if you could address them.

In terms of the package placement of the key converter classes, should they be 
in the respective packages instead of all in common? For example, 
{{ApplicationRowKeyConverter}} is really used by the application table classes, 
so it would be more natural to have it in the application package, and so on. 
Thoughts?

Also, do we have a test that tests an encoded long having a separator in it? 
After all, that's what caused us to uncover this issue. :)

(AppIdKeyConverter.java)
- one small suggestion: how about adding a method {{getKeySize()}} to return 
the expected size of the key so that users of {{AppIdKeyConverter}} do not need 
to hard-code the size themselves?

(FlowActivityRowKeyConverter.java)
- l.55: Should we replace "" with {{Separator.EMPTY_BYTES}}? That should be 
equivalent, right?

(FlowRunRowKeyConverter.java)
- l.56: same as above

(EventColumnName.java)
- l.31: the {{super()}} call is superfluous; can we remove it?

(ColumnHelper.java)
- l.259: nit: typo ({{converteColumnKey}} -> {{converterColumnKey}} or?)

(Separator.java)
- l.71: I think {{NO_LIMIT_SPLIT}} and {{VARIABLE_SIZE}} are getting confusing. 
Since we're using {{VARIABLE_SIZE}} for the most part, can we remove 
{{NO_LIMIT_SPLIT}}?
- l.491: I think this now calls {{split(byte[], byte[], int[])}}, not 
{{split(byte[], byte[], int)}}, which is quite confusing. We should eliminate 
the ambiguity here, by explicitly calling with the right argument instead of 
just {{null}}.
- we should make {{splitRanges()}} private or package-private
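
The overload point on l.491 can be reproduced in isolation. The class below is a stand-alone illustration, not the real {{Separator}}: the two {{split}} methods only report which overload was chosen. A bare {{null}} can never bind to a primitive {{int}} parameter, so the compiler silently picks the {{int[]}} overload; an explicit cast makes that choice visible to the reader.

```java
/** Stand-alone illustration of overload resolution with a null argument. */
final class OverloadDemo {

  static String split(byte[] source, byte[] separator, int limit) {
    return "int";
  }

  static String split(byte[] source, byte[] separator, int[] sizes) {
    return "int[]";
  }

  public static void main(String[] args) {
    byte[] source = {1};
    byte[] separator = {2};
    // null is not convertible to int, so only the int[] overload applies.
    System.out.println(split(source, separator, null));         // prints "int[]"
    // Same call with the intent spelled out.
    System.out.println(split(source, separator, (int[]) null)); // prints "int[]"
    // An int literal selects the other overload.
    System.out.println(split(source, separator, -1));           // prints "int"
  }
}
```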

> timestamps are stored unencoded causing parse errors
> 
>
> Key: YARN-5109
> URL: https://issues.apache.org/jira/browse/YARN-5109
> Project: Hadoop YARN
> Issue Type: Sub-task
> Components: timelineserver
> Affects Versions: YARN-2928
> Reporter: Sangjin Lee
> Assignee: Varun Saxena
> Priority: Blocker
> Labels: yarn-2928-1st-milestone
> Attachments: YARN-5109-YARN-2928.003.patch, 
> YARN-5109-YARN-2928.01.patch, YARN-5109-YARN-2928.02.patch, 
> YARN-5109-YARN-2928.03.patch
>
>
> When we store timestamps (for example as part of the row key or part of the 
> column name for an event), the bytes are used as is without any encoding. If 
> the byte value happens to contain a separator character we use (e.g. "!" or 
> "="), it causes a parse failure when we read it.
> I came across this while looking into this error in the timeline reader:
> {noformat}
> 2016-05-17 21:28:38,643 WARN 
> org.apache.hadoop.yarn.server.timelineservice.storage.common.TimelineStorageUtils:
>  incorrectly formatted column name: it will be discarded
> {noformat}
> I traced the data that was causing this, and the column name (for the event) 
> was the following:
> {noformat}
> i:e!YARN_RM_CONTAINER_CREATED=\x7F\xFF\xFE\xABDY=\x99=YARN_CONTAINER_ALLOCATED_HOST
> {noformat}
> Note that the column name is supposed to be of the format (event 
> id)=(timestamp)=(event info key). However, observe the timestamp portion:
> {noformat}
> \x7F\xFF\xFE\xABDY=\x99
> {noformat}
> The presence of the separator ("=") causes the parse error.
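
The failure described above can be reproduced with a few lines of Java. This is only an illustration of the mechanism: {{NaiveColumnNameSplit}} and its methods are hypothetical, the scan is a plain byte-by-byte split rather than the timeline service's own parser, and the leading "i:e!" part of the column name is left out.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

/** Illustrative only: a naive separator scan, not the timeline service's parser. */
final class NaiveColumnNameSplit {

  /** Splits source at every occurrence of the single-byte separator. */
  static List<byte[]> split(byte[] source, byte separator) {
    List<byte[]> parts = new ArrayList<>();
    int start = 0;
    for (int i = 0; i <= source.length; i++) {
      if (i == source.length || source[i] == separator) {
        byte[] part = new byte[i - start];
        System.arraycopy(source, start, part, 0, part.length);
        parts.add(part);
        start = i + 1;
      }
    }
    return parts;
  }

  /** Joins (event id)=(timestamp)=(event info key) using the raw timestamp bytes. */
  static byte[] columnName(String eventId, byte[] timestamp, String infoKey) {
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    byte[] id = eventId.getBytes(StandardCharsets.UTF_8);
    byte[] key = infoKey.getBytes(StandardCharsets.UTF_8);
    out.write(id, 0, id.length);
    out.write('=');
    out.write(timestamp, 0, timestamp.length);
    out.write('=');
    out.write(key, 0, key.length);
    return out.toByteArray();
  }

  /** The timestamp bytes quoted in the report; 'D', 'Y', '=' are 0x44, 0x59, 0x3D. */
  static byte[] reportedTimestamp() {
    return new byte[] {0x7F, (byte) 0xFF, (byte) 0xFE, (byte) 0xAB, 'D', 'Y', '=', (byte) 0x99};
  }

  public static void main(String[] args) {
    byte[] name = columnName("YARN_RM_CONTAINER_CREATED", reportedTimestamp(),
        "YARN_CONTAINER_ALLOCATED_HOST");
    // Three fields were written, but the scan finds three separators, hence four parts.
    System.out.println(split(name, (byte) '=').size()); // prints 4
  }
}
```

The second part comes back as only six bytes, so it can no longer be decoded as an 8-byte long.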






[jira] [Commented] (YARN-5109) timestamps are stored unencoded causing parse errors

2016-05-21 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15295281#comment-15295281
 ] 

Hadoop QA commented on YARN-5109:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 14s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
0s {color} | {color:green} The patch appears to include 5 new or modified test 
files. {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 25s 
{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 
5s {color} | {color:green} YARN-2928 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 15s 
{color} | {color:green} YARN-2928 passed with JDK v1.8.0_91 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 31s 
{color} | {color:green} YARN-2928 passed with JDK v1.7.0_101 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
28s {color} | {color:green} YARN-2928 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 46s 
{color} | {color:green} YARN-2928 passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
33s {color} | {color:green} YARN-2928 passed {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 0m 13s 
{color} | {color:red} 
branch/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice-hbase-tests
 no findbugs output file 
(hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice-hbase-tests/target/findbugsXml.xml)
 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 26s 
{color} | {color:green} YARN-2928 passed with JDK v1.8.0_91 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 31s 
{color} | {color:green} YARN-2928 passed with JDK v1.7.0_101 {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 12s 
{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 
36s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 12s 
{color} | {color:green} the patch passed with JDK v1.8.0_91 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 12s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 30s 
{color} | {color:green} the patch passed with JDK v1.7.0_101 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 30s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
25s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 42s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
28s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 0m 11s 
{color} | {color:red} 
patch/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice-hbase-tests
 no findbugs output file 
(hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice-hbase-tests/target/findbugsXml.xml)
 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 21s 
{color} | {color:green} the patch passed with JDK v1.8.0_91 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 27s 
{color} | {color:green} the patch passed with JDK v1.7.0_101 {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 39s 
{color} | {color:green} hadoop-yarn-server-timelineservice in the patch passed 
with JDK v1.8.0_91. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 4m 0s 
{color} | {color:green} hadoop-yarn-server-timelineservice-hbase-tests in the 
patch passed with JDK v1.8.0_91. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 47s 
{color} | {color:green} 

[jira] [Commented] (YARN-5109) timestamps are stored unencoded causing parse errors

2016-05-21 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15295257#comment-15295257
 ] 

Hadoop QA commented on YARN-5109:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 14s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
0s {color} | {color:green} The patch appears to include 5 new or modified test 
files. {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 2m 39s 
{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 8m 
32s {color} | {color:green} YARN-2928 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 18s 
{color} | {color:green} YARN-2928 passed with JDK v1.8.0_91 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 32s 
{color} | {color:green} YARN-2928 passed with JDK v1.7.0_101 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
33s {color} | {color:green} YARN-2928 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 49s 
{color} | {color:green} YARN-2928 passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
38s {color} | {color:green} YARN-2928 passed {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 0m 13s 
{color} | {color:red} 
branch/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice-hbase-tests
 no findbugs output file 
(hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice-hbase-tests/target/findbugsXml.xml)
 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 29s 
{color} | {color:green} YARN-2928 passed with JDK v1.8.0_91 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 31s 
{color} | {color:green} YARN-2928 passed with JDK v1.7.0_101 {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 11s 
{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 
36s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 13s 
{color} | {color:green} the patch passed with JDK v1.8.0_91 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 13s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 30s 
{color} | {color:green} the patch passed with JDK v1.7.0_101 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 30s 
{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 25s 
{color} | {color:red} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server: patch 
generated 8 new + 2 unchanged - 0 fixed = 10 total (was 2) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 42s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
28s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 0m 12s 
{color} | {color:red} 
patch/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice-hbase-tests
 no findbugs output file 
(hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice-hbase-tests/target/findbugsXml.xml)
 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 22s 
{color} | {color:green} the patch passed with JDK v1.8.0_91 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 26s 
{color} | {color:green} the patch passed with JDK v1.7.0_101 {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 40s 
{color} | {color:green} hadoop-yarn-server-timelineservice in the patch passed 
with JDK v1.8.0_91. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 4m 1s 
{color} | {color:green} hadoop-yarn-server-timelineservice-hbase-tests in the 
patch passed with JDK v1.8.0_91. {color} |
| 

[jira] [Commented] (YARN-5109) timestamps are stored unencoded causing parse errors

2016-05-21 Thread Varun Saxena (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15295242#comment-15295242
 ] 

Varun Saxena commented on YARN-5109:


I am consistently getting build failures on Jenkins with the following error:
{noformat}
ERROR: a previous rebase failed. Aborting it.
HEAD is now at d491ef0 YARN-3367. Replace starting a separate thread for post 
entity with event loop in TimelineClient (Naganarasimha G R via sjlee)
Switched to branch 'trunk'
Your branch is up-to-date with 'origin/trunk'.
Current branch trunk is up to date.
Switched to branch 'YARN-2928'
Your branch and 'origin/YARN-2928' have diverged,
and have 85 and 782 different commits each, respectively.
  (use "git pull" to merge the remote branch into yours)
First, rewinding head to replay your work on top of it...
Applying: YARN-3063. Bootstrapping TimelineServer next generation module. 
Contributed by Zhijie Shen.
Using index info to reconstruct a base tree...
M   hadoop-project/pom.xml
A   hadoop-yarn-project/CHANGES.txt
M   hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/pom.xml
:47: trailing whitespace.
Trunk - Unreleased 
warning: 1 line adds whitespace errors.
Falling back to patching base and 3-way merge...
Auto-merging hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/pom.xml
CONFLICT (content): Merge conflict in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/pom.xml
Auto-merging 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice/pom.xml
CONFLICT (add/add): Merge conflict in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice/pom.xml
CONFLICT (modify/delete): hadoop-yarn-project/CHANGES.txt deleted in HEAD and 
modified in YARN-3063. Bootstrapping TimelineServer next generation module. 
Contributed by Zhijie Shen.. Version YARN-3063. Bootstrapping TimelineServer 
next generation module. Contributed by Zhijie Shen. of 
hadoop-yarn-project/CHANGES.txt left in tree.
Auto-merging hadoop-project/pom.xml
CONFLICT (content): Merge conflict in hadoop-project/pom.xml
Failed to merge in the changes.
Patch failed at 0001 YARN-3063. Bootstrapping TimelineServer next generation 
module. Contributed by Zhijie Shen.
The copy of the patch that failed is found in:
   
/home/jenkins/jenkins-slave/workspace/PreCommit-YARN-Build/.git/rebase-apply/patch

When you have resolved this problem, run "git rebase --continue".
If you prefer to skip this patch, run "git rebase --skip" instead.
To check out the original branch and stop rebasing, run "git rebase --abort".

ERROR: git pull is failing
{noformat}

As a workaround, I submitted the JIRA manually twice in quick succession so 
that a second build could start before the first one failed; the second build 
then works. It seems the workspace on one of the machines has a problem.




[jira] [Commented] (YARN-5109) timestamps are stored unencoded causing parse errors

2016-05-21 Thread Varun Saxena (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15295203#comment-15295203
 ] 

Varun Saxena commented on YARN-5109:


Updating the patch to try to fix the checkstyle issues.




[jira] [Commented] (YARN-5109) timestamps are stored unencoded causing parse errors

2016-05-21 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15295159#comment-15295159
 ] 

Hadoop QA commented on YARN-5109:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 18s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
0s {color} | {color:green} The patch appears to include 5 new or modified test 
files. {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 57s 
{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 
12s {color} | {color:green} YARN-2928 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 22s 
{color} | {color:green} YARN-2928 passed with JDK v1.8.0_91 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 32s 
{color} | {color:green} YARN-2928 passed with JDK v1.7.0_101 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
29s {color} | {color:green} YARN-2928 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 46s 
{color} | {color:green} YARN-2928 passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
33s {color} | {color:green} YARN-2928 passed {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 0m 13s 
{color} | {color:red} 
branch/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice-hbase-tests
 no findbugs output file 
(hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice-hbase-tests/target/findbugsXml.xml)
 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 25s 
{color} | {color:green} YARN-2928 passed with JDK v1.8.0_91 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 31s 
{color} | {color:green} YARN-2928 passed with JDK v1.7.0_101 {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 11s 
{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 
37s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 18s 
{color} | {color:green} the patch passed with JDK v1.8.0_91 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 18s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 29s 
{color} | {color:green} the patch passed with JDK v1.7.0_101 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 29s 
{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 27s 
{color} | {color:red} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server: patch 
generated 7 new + 2 unchanged - 0 fixed = 9 total (was 2) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 41s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
28s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 0m 11s 
{color} | {color:red} 
patch/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice-hbase-tests
 no findbugs output file 
(hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice-hbase-tests/target/findbugsXml.xml)
 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 21s 
{color} | {color:green} the patch passed with JDK v1.8.0_91 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 26s 
{color} | {color:green} the patch passed with JDK v1.7.0_101 {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 41s 
{color} | {color:green} hadoop-yarn-server-timelineservice in the patch passed 
with JDK v1.8.0_91. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 3m 27s 
{color} | {color:green} hadoop-yarn-server-timelineservice-hbase-tests in the 
patch passed with JDK v1.8.0_91. {color} |
| 

[jira] [Commented] (YARN-5109) timestamps are stored unencoded causing parse errors

2016-05-20 Thread Joep Rottinghuis (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15294610#comment-15294610
 ] 

Joep Rottinghuis commented on YARN-5109:


Seems sensible. Looking forward to seeing it in context in the patch.



> timestamps are stored unencoded causing parse errors
> 
>
> Key: YARN-5109
> URL: https://issues.apache.org/jira/browse/YARN-5109
> Project: Hadoop YARN
> Issue Type: Sub-task
> Components: timelineserver
> Affects Versions: YARN-2928
> Reporter: Sangjin Lee
> Assignee: Varun Saxena
> Priority: Blocker
> Labels: yarn-2928-1st-milestone
> Attachments: YARN-5109-YARN-2928.01.patch, 
> YARN-5109-YARN-2928.02.patch
>
>
> When we store timestamps (for example as part of the row key or part of the 
> column name for an event), the bytes are used as is without any encoding. If 
> the byte value happens to contain a separator character we use (e.g. "!" or 
> "="), it causes a parse failure when we read it.
> I came across this while looking into this error in the timeline reader:
> {noformat}
> 2016-05-17 21:28:38,643 WARN 
> org.apache.hadoop.yarn.server.timelineservice.storage.common.TimelineStorageUtils:
>  incorrectly formatted column name: it will be discarded
> {noformat}
> I traced the data that was causing this, and the column name (for the event) 
> was the following:
> {noformat}
> i:e!YARN_RM_CONTAINER_CREATED=\x7F\xFF\xFE\xABDY=\x99=YARN_CONTAINER_ALLOCATED_HOST
> {noformat}
> Note that the column name is supposed to be of the format (event 
> id)=(timestamp)=(event info key). However, observe the timestamp portion:
> {noformat}
> \x7F\xFF\xFE\xABDY=\x99
> {noformat}
> The presence of the separator ("=") causes the parse error.






[jira] [Commented] (YARN-5109) timestamps are stored unencoded causing parse errors

2016-05-20 Thread Varun Saxena (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15294603#comment-15294603
 ] 

Varun Saxena commented on YARN-5109:


[~sjlee0], yeah, this doesn't break anything. I was just curious to know the 
reason for the different approaches.
The above makes sense.




[jira] [Commented] (YARN-5109) timestamps are stored unencoded causing parse errors

2016-05-20 Thread Varun Saxena (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15294601#comment-15294601
 ] 

Varun Saxena commented on YARN-5109:


[~sjlee0],
Yes, type safety will have to be ensured within this encode.
The specific function I am talking about is 
{{TimelineFilterUtils#createFiltersFromColumnQualifiers}}. This is used for 
events and relations. Event filters and relation filters cannot be applied 
using the HBase SingleColumnValueFilter, so we fetch all the columns that are 
specified in the event filters and relation filters (i.e. events in event 
filters or entity types in relation filters).

I mentioned {{Object... params}} because that is what came to mind just 
before signing off for the day.

But on second thought, I think we can have a switch statement based on the 
column prefix and construct EventColumnName from there. We will have only 2 
cases here other than the default (i.e. ApplicationColumnPrefix.EVENT and 
EntityColumnPrefix.EVENT). The number of cases in this switch should not become 
unmanageable even in the long term. And if it does, we can revisit the 
solution then.
I will go with this approach for now.
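
A rough sketch of that dispatch is below. It is an illustration under stated assumptions, not the patch: the {{Prefix}} enum stands in for the real per-table column prefix enums, {{EventName}} stands in for EventColumnName, and the choice to encode an event prefix as the event id followed by "=" is a guess made only so the example has something to return.

```java
import java.nio.charset.StandardCharsets;

/** Hypothetical sketch of choosing a qualifier encoding by column prefix. */
final class QualifierPrefixSketch {

  /** Stand-in for the column prefixes of the application and entity tables. */
  enum Prefix { APPLICATION_EVENT, ENTITY_EVENT, ENTITY_RELATES_TO }

  /** Typed event column name; null fields mean "not part of the prefix". */
  static final class EventName {
    final String id;
    final Long timestamp;
    final String infoKey;

    EventName(String id, Long timestamp, String infoKey) {
      this.id = id;
      this.timestamp = timestamp;
      this.infoKey = infoKey;
    }
  }

  /** Assumed encoding when only the id is known: the id followed by the separator. */
  static byte[] encodePrefix(EventName name) {
    return (name.id + "=").getBytes(StandardCharsets.UTF_8);
  }

  /** Typed construction for event prefixes, plain string bytes for everything else. */
  static byte[] qualifierFor(Prefix prefix, String value) {
    switch (prefix) {
      case APPLICATION_EVENT:
      case ENTITY_EVENT:
        return encodePrefix(new EventName(value, null, null));
      default:
        return value.getBytes(StandardCharsets.UTF_8);
    }
  }
}
```

Keeping the event case behind a typed constructor avoids passing untyped {{Object... params}} through the filter code.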




[jira] [Commented] (YARN-5109) timestamps are stored unencoded causing parse errors

2016-05-20 Thread Joep Rottinghuis (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15294600#comment-15294600
 ] 

Joep Rottinghuis commented on YARN-5109:


Agreed with [~sjlee0] that we'd like to avoid non-type-safe conversions. I 
would indeed like to see where exactly the challenge lies. Perhaps the filters 
can be parameterized. Hard to say without understanding the exact use case.
I'm sure this ends up as a non-trivial refactor considering the various cases: 
prefix or not, compound column keys or strings, row keys, and on top of that 
filters...




[jira] [Commented] (YARN-5109) timestamps are stored unencoded causing parse errors

2016-05-20 Thread Joep Rottinghuis (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15294595#comment-15294595
 ] 

Joep Rottinghuis commented on YARN-5109:


bq. The main motivation for reversing the user and the cluster in the entity 
table is to accommodate the fact that the table can get real large and we 
wanted to provide good partitioning by using the user dimension rather than the 
cluster dimension. We preserved the original order (cluster and then user) for 
the application table.

Indeed. Aside from sheer size, which would be roughly equal in both cases, we 
especially want to avoid hotspotting during writes with a very high update 
volume. Even if we can keep the total size under control with an expiration 
policy, the fact remains that if there is one particularly large cluster 
(and/or just one cluster), the cluster prefix doesn't really help spread the 
load. If the user is first, we at least "salt" the key with the user, so the 
load gets spread across the various users.
Of course, the same issue could happen the other way around: if somebody runs 
many clusters and all of them run jobs emitting entities (with metric time 
series data) as one single user (let's say "hadoop"), we'd still hotspot.

The volume for the application table _should_ be smaller. In a multi-cluster 
setup, the load would also spread there. Range scans per cluster would be more 
efficient over the application table, at the cost of reduced parallelism during 
writes.
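
The effect of the component order can be illustrated with a small sketch. It is not the actual row key code: the class `RowKeyOrderSketch`, the string keys, and the sample cluster, user, and flow names are all hypothetical, and real row keys carry more components than shown here. It only relies on the fact that HBase keeps rows sorted by row key, so keys sharing a leading component land next to each other.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

public class RowKeyOrderSketch {
  // Application-table style order: cluster, then user.
  static String clusterFirst(String cluster, String user, String flow) {
    return cluster + "!" + user + "!" + flow;
  }

  // Entity-table style order: user, then cluster.
  static String userFirst(String cluster, String user, String flow) {
    return user + "!" + cluster + "!" + flow;
  }

  // Counts distinct leading components. Rows with the same leading
  // component are contiguous in sorted order, so fewer prefixes means
  // writes concentrate in fewer key ranges.
  static int distinctPrefixes(Collection<String> keys) {
    Set<String> prefixes = new TreeSet<>();
    for (String key : keys) {
      prefixes.add(key.substring(0, key.indexOf('!')));
    }
    return prefixes.size();
  }

  public static void main(String[] args) {
    String[] users = {"alice", "bob", "carol"};
    List<String> byCluster = new ArrayList<>();
    List<String> byUser = new ArrayList<>();
    // A single-cluster deployment with several users.
    for (String user : users) {
      byCluster.add(clusterFirst("cluster1", user, "flowA"));
      byUser.add(userFirst("cluster1", user, "flowA"));
    }
    System.out.println("cluster-first prefixes = " + distinctPrefixes(byCluster));
    System.out.println("user-first prefixes = " + distinctPrefixes(byUser));
  }
}
```

With one cluster and three users, the cluster-first keys all share a single prefix while the user-first keys fall under three. The reverse case mentioned above (many clusters, one user) would flip the result.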

bq. The bottom line is that since this was the intended design and nothing is 
broken, we should not revisit it as part of this JIRA. Let me know if that is 
OK with you guys.
+1 agreed.




[jira] [Commented] (YARN-5109) timestamps are stored unencoded causing parse errors

2016-05-20 Thread Sangjin Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15294559#comment-15294559
 ] 

Sangjin Lee commented on YARN-5109:
---

Hmm, could you point to the specific method where the proposed {{byte[] 
encode(T key)}} would not work and we would need one that takes an {{Object}} 
array? That's a bit worrisome as {{Object... params}} is not really type-safe, 
and it can be error-prone.
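
For illustration, a typed converter along the lines of {{byte[] encode(T key)}} might look like the sketch below. This is a hedged example and not the code in any attached patch: `TypedKeyConverterSketch`, the shape of `EventColumnName`, and `EventColumnNameConverter` are hypothetical, and the decoder assumes that the event id itself never contains "=". The timestamp is read as a fixed 8-byte field, so a separator byte inside it is never interpreted.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

public class TypedKeyConverterSketch {
  // One converter per key type; the compiler rejects a wrong key type.
  interface KeyConverter<T> {
    byte[] encode(T key);
    T decode(byte[] bytes);
  }

  // Hypothetical typed key: (event id)=(timestamp)=(event info key).
  static final class EventColumnName {
    final String id;
    final long timestamp;
    final String infoKey;

    EventColumnName(String id, long timestamp, String infoKey) {
      this.id = id;
      this.timestamp = timestamp;
      this.infoKey = infoKey;
    }
  }

  static final class EventColumnNameConverter
      implements KeyConverter<EventColumnName> {
    private static final byte SEP = (byte) '=';

    @Override
    public byte[] encode(EventColumnName key) {
      byte[] id = key.id.getBytes(StandardCharsets.UTF_8);
      byte[] info = key.infoKey.getBytes(StandardCharsets.UTF_8);
      return ByteBuffer.allocate(id.length + 1 + Long.BYTES + 1 + info.length)
          .put(id).put(SEP).putLong(key.timestamp).put(SEP).put(info).array();
    }

    @Override
    public EventColumnName decode(byte[] bytes) {
      // Assumption: the id contains no '=', so the first '=' ends it.
      int idEnd = 0;
      while (idEnd < bytes.length && bytes[idEnd] != SEP) {
        idEnd++;
      }
      String id = new String(bytes, 0, idEnd, StandardCharsets.UTF_8);
      // Fixed width: the next 8 bytes are the timestamp, whatever they hold.
      long ts = ByteBuffer.wrap(bytes).getLong(idEnd + 1);
      int infoStart = idEnd + 1 + Long.BYTES + 1;
      String info = new String(bytes, infoStart, bytes.length - infoStart,
          StandardCharsets.UTF_8);
      return new EventColumnName(id, ts, info);
    }
  }

  public static void main(String[] args) {
    EventColumnNameConverter converter = new EventColumnNameConverter();
    EventColumnName original = new EventColumnName(
        "YARN_RM_CONTAINER_CREATED", 0x7FFFFEAB44593D99L,
        "YARN_CONTAINER_ALLOCATED_HOST");
    EventColumnName decoded = converter.decode(converter.encode(original));
    System.out.println(decoded.timestamp == original.timestamp);
  }
}
```

A call such as {{converter.encode("some string")}} fails at compile time here, which is the property an {{Object... params}} signature gives up. The sketch does not validate malformed input; a too-short byte array would raise an exception in {{decode}}.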




[jira] [Commented] (YARN-5109) timestamps are stored unencoded causing parse errors

2016-05-20 Thread Sangjin Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15294555#comment-15294555
 ] 

Sangjin Lee commented on YARN-5109:
---

[~jrottinghuis], [~vrushalic], and I dug a little bit, and it appears to be 
intentional. See YARN-3906 and YARN-3815 (see [the attached 
doc|https://issues.apache.org/jira/secure/attachment/12743391/hbase-schema-proposal-for-aggregation.pdf]).
 The main motivation for reversing the user and the cluster in the entity table 
is to accommodate the fact that the table can get real large and we wanted to 
provide good partitioning by using the user dimension rather than the cluster 
dimension. We preserved the original order (cluster and then user) for the 
application table.

The bottom line is that since this was the intended design and nothing is 
broken, we should not revisit it as part of this JIRA. Let me know if that is 
OK with you guys.




[jira] [Commented] (YARN-5109) timestamps are stored unencoded causing parse errors

2016-05-20 Thread Varun Saxena (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15294277#comment-15294277
 ] 

Varun Saxena commented on YARN-5109:


In fact, even getColumnPrefixBytes would not work for the code in 
TimelineFilterUtils, as we would not know which class the converter takes 
(say, EventColumnName or String). So we would probably need another method 
in the converter which takes multiple Objects as parameters, interprets them in 
sequence, and either encodes them or returns a key. Something like below. Then 
we can pass either an encoded byte array or the Object which needs to be 
encoded to ColumnPrefix to attach a prefix in front of the qualifier.
{code}
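public interface KeyConverter<T> {
  // Option 1: encode the parameters directly, interpreting them in sequence.
  byte[] encode(Object... params);

  // OR option 2: build a typed key from the parameters, then encode it.
  T createKey(Object... params);
  byte[] encode(T key);

  T decode(byte[] bytes);
}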
{code}

Will look at it tomorrow. 




[jira] [Commented] (YARN-5109) timestamps are stored unencoded causing parse errors

2016-05-20 Thread Varun Saxena (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15294198#comment-15294198
 ] 

Varun Saxena commented on YARN-5109:


Just to update, the patch is almost complete.
Will have it up by




[jira] [Commented] (YARN-5109) timestamps are stored unencoded causing parse errors

2016-05-20 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15293975#comment-15293975
 ] 

Hadoop QA commented on YARN-5109:
-

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 15s | Docker mode activated. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 1 new or modified test files. |
| 0 | mvndep | 0m 13s | Maven dependency ordering for branch |
| +1 | mvninstall | 8m 21s | YARN-2928 passed |
| +1 | compile | 1m 47s | YARN-2928 passed with JDK v1.8.0_91 |
| +1 | compile | 1m 49s | YARN-2928 passed with JDK v1.7.0_101 |
| +1 | checkstyle | 0m 29s | YARN-2928 passed |
| +1 | mvnsite | 0m 51s | YARN-2928 passed |
| +1 | mvneclipse | 0m 37s | YARN-2928 passed |
| -1 | findbugs | 0m 15s | branch/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice-hbase-tests no findbugs output file (hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice-hbase-tests/target/findbugsXml.xml) |
| +1 | javadoc | 0m 29s | YARN-2928 passed with JDK v1.8.0_91 |
| +1 | javadoc | 0m 32s | YARN-2928 passed with JDK v1.7.0_101 |
| 0 | mvndep | 0m 13s | Maven dependency ordering for patch |
| +1 | mvninstall | 0m 43s | the patch passed |
| +1 | compile | 1m 38s | the patch passed with JDK v1.8.0_91 |
| +1 | javac | 1m 38s | the patch passed |
| +1 | compile | 1m 43s | the patch passed with JDK v1.7.0_101 |
| +1 | javac | 1m 43s | the patch passed |
| -1 | checkstyle | 0m 29s | hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server: patch generated 19 new + 2 unchanged - 0 fixed = 21 total (was 2) |
| +1 | mvnsite | 0m 50s | the patch passed |
| +1 | mvneclipse | 0m 31s | the patch passed |
| +1 | whitespace | 0m 0s | Patch has no whitespace issues. |
| -1 | findbugs | 0m 13s | patch/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice-hbase-tests no findbugs output file (hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice-hbase-tests/target/findbugsXml.xml) |
| -1 | javadoc | 1m 38s | hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-timelineservice-jdk1.8.0_91 with JDK v1.8.0_91 generated 10 new + 0 unchanged - 0 fixed = 10 total (was 0) |
| +1 | javadoc | 0m 27s | the patch passed with JDK v1.8.0_91 |
| -1 | javadoc | 2m 28s | hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-timelineservice-jdk1.7.0_101 with JDK v1.7.0_101 generated 2 new + 0 unchanged - 0 fixed = 2 total 

[jira] [Commented] (YARN-5109) timestamps are stored unencoded causing parse errors

2016-05-20 Thread Sangjin Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15293713#comment-15293713
 ] 

Sangjin Lee commented on YARN-5109:
---

The {{ApplicationTable}} and {{EntityTable}} javadoc also reflect it, so it 
appears to be intentional, but I may be forgetting something.




[jira] [Commented] (YARN-5109) timestamps are stored unencoded causing parse errors

2016-05-20 Thread Sangjin Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15293705#comment-15293705
 ] 

Sangjin Lee commented on YARN-5109:
---

Hmm, I don't remember there being a reason the application row key had to be 
different from the entity row key. Maybe I'm forgetting something. 
[~vrushalic], [~jrottinghuis], did we intentionally make the application row 
key structure different from the entity row key structure, or is this my error?




[jira] [Commented] (YARN-5109) timestamps are stored unencoded causing parse errors

2016-05-20 Thread Varun Saxena (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15293666#comment-15293666
 ] 

Varun Saxena commented on YARN-5109:


[~sjlee0], is there any reason we have the cluster ID followed by the user ID 
in the application row key, and the other way round in the entity row key? I 
just noticed this while coding.




[jira] [Commented] (YARN-5109) timestamps are stored unencoded causing parse errors

2016-05-20 Thread Varun Saxena (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15293464#comment-15293464
 ] 

Varun Saxena commented on YARN-5109:


Yes, I have taken that patch and am working on top of it.




[jira] [Commented] (YARN-5109) timestamps are stored unencoded causing parse errors

2016-05-20 Thread Joep Rottinghuis (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15293458#comment-15293458
 ] 

Joep Rottinghuis commented on YARN-5109:


The code in the patch I attached compiles and passes the unit tests, so by 
itself it should be good to go, modulo the to-do items listed above.




[jira] [Commented] (YARN-5109) timestamps are stored unencoded causing parse errors

2016-05-20 Thread Varun Saxena (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15293028#comment-15293028
 ] 

Varun Saxena commented on YARN-5109:


Thanks Sangjin and Joep for the pseudocode and prototype.
Now I clearly see what both of you were alluding to in the meeting. On the 
face of it, this should work in all the cases.

I will check this in detail and hopefully have a concrete patch soon.

> timestamps are stored unencoded causing parse errors
> 
>
> Key: YARN-5109
> URL: https://issues.apache.org/jira/browse/YARN-5109
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Affects Versions: YARN-2928
>Reporter: Sangjin Lee
>Assignee: Varun Saxena
>Priority: Blocker
>  Labels: yarn-2928-1st-milestone
> Attachments: YARN-5109-YARN-2928.01.patch, 
> YARN-5109-YARN-2928.02.patch
>
>
> When we store timestamps (for example as part of the row key or part of the 
> column name for an event), the bytes are used as is without any encoding. If 
> the byte value happens to contain a separator character we use (e.g. "!" or 
> "="), it causes a parse failure when we read it.
> I came across this while looking into this error in the timeline reader:
> {noformat}
> 2016-05-17 21:28:38,643 WARN 
> org.apache.hadoop.yarn.server.timelineservice.storage.common.TimelineStorageUtils:
>  incorrectly formatted column name: it will be discarded
> {noformat}
> I traced the data that was causing this, and the column name (for the event) 
> was the following:
> {noformat}
> i:e!YARN_RM_CONTAINER_CREATED=\x7F\xFF\xFE\xABDY=\x99=YARN_CONTAINER_ALLOCATED_HOST
> {noformat}
> Note that the column name is supposed to be of the format (event 
> id)=(timestamp)=(event info key). However, observe the timestamp portion:
> {noformat}
> \x7F\xFF\xFE\xABDY=\x99
> {noformat}
> The presence of the separator ("=") causes the parse error.
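
The failure described above can be reproduced in isolation. The following is a minimal sketch, not the timeline service code: `NaiveSplitDemo`, `naiveSplit` and `columnName` are hypothetical names, the column prefix is left out, and the only value taken from the issue is the 8-byte timestamp `\x7F\xFF\xFE\xABDY=\x99`.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class NaiveSplitDemo {
  /** Splits on every occurrence of a one-byte separator, ignoring structure. */
  static List<byte[]> naiveSplit(byte[] source, byte separator) {
    List<byte[]> tokens = new ArrayList<>();
    int start = 0;
    for (int i = 0; i < source.length; i++) {
      if (source[i] == separator) {
        tokens.add(Arrays.copyOfRange(source, start, i));
        start = i + 1;
      }
    }
    tokens.add(Arrays.copyOfRange(source, start, source.length));
    return tokens;
  }

  /** Builds (event id)=(timestamp)=(info key) with the raw timestamp bytes. */
  static byte[] columnName(String eventId, long invertedTs, String infoKey) {
    byte[] id = eventId.getBytes(StandardCharsets.UTF_8);
    byte[] key = infoKey.getBytes(StandardCharsets.UTF_8);
    return ByteBuffer.allocate(id.length + 1 + Long.BYTES + 1 + key.length)
        .put(id).put((byte) '=').putLong(invertedTs).put((byte) '=').put(key)
        .array();
  }

  public static void main(String[] args) {
    // the timestamp bytes quoted in the issue; the seventh byte is 0x3D ('=')
    long invertedTs = 0x7FFFFEAB44593D99L;
    byte[] name = columnName("YARN_RM_CONTAINER_CREATED", invertedTs,
        "YARN_CONTAINER_ALLOCATED_HOST");
    // prints 4, not the expected 3
    System.out.println(naiveSplit(name, (byte) '=').size());
  }
}
```

The extra token comes from the timestamp being cut in two at its own `0x3D` byte, which is why the reader rejects the column name.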






[jira] [Commented] (YARN-5109) timestamps are stored unencoded causing parse errors

2016-05-20 Thread Joep Rottinghuis (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15292819#comment-15292819
 ] 

Joep Rottinghuis commented on YARN-5109:


[~varun_saxena] I'm going to briefly assign this jira to myself so that I can 
upload a new patch. After uploading it, I'll assign it back to you. It seems 
that you cannot attach a patch unless you are the assignee, perhaps as one of 
the security measures they have taken.




[jira] [Commented] (YARN-5109) timestamps are stored unencoded causing parse errors

2016-05-19 Thread Joep Rottinghuis (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15292696#comment-15292696
 ] 

Joep Rottinghuis commented on YARN-5109:


I have an almost working prototype in which we can combine the readResults and 
readResultsHavingCompoundColumnQualifiers methods. It also fixes a case where 
we do not properly decode spaces in a column name.
Aside from that, it removes the (false) assumption that if a prefix is null 
then the column qualifiers cannot be compound.
Whether column qualifiers are compound is independent of whether prefixes are 
null.

I am also removing the cross-dependency between Separator and TimelineStorageUtils.
I will upload the patch (in whatever state I have it) when I log off for tonight.




[jira] [Commented] (YARN-5109) timestamps are stored unencoded causing parse errors

2016-05-19 Thread Sangjin Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15292270#comment-15292270
 ] 

Sangjin Lee commented on YARN-5109:
---

Another related issue I see is that currently the event id and the info key are 
not encoded when they are joined. I think this is an opportunity to address that. 
It is rather unlikely that the event id or info key would contain "=", but 
better safe than sorry...




[jira] [Commented] (YARN-5109) timestamps are stored unencoded causing parse errors

2016-05-19 Thread Sangjin Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15292229#comment-15292229
 ] 

Sangjin Lee commented on YARN-5109:
---

Joep, Vrushali, and I discussed some more offline, and we realized that there 
can be a reasonable pattern we can extract based on the individual 
{{*RowKey.getRowKey()}} and {{*RowKey.parseRowKey()}}, and extend it to cover 
the compound column names also.

We could have an interface that encapsulates the concern of defining the key 
structure and how to encode/decode it. We would still need the new {{split()}} 
method to dictate the low-level parsing.

The following is a proof-of-concept pseudo code I put together in a hurry:

{code}
public interface CompoundKeyConverter<T> {
  byte[] encodeKey(T t);
  T decodeKey(byte[] b);
  // it's debatable whether this should be part of the interface
  int[] sizesOfTokens();
}

public class ApplicationRowKeyConverter
    implements CompoundKeyConverter<ApplicationRowKey> {
  public static final int[] SIZES =
      {0, 0, 0, Long.BYTES, Long.BYTES + Integer.BYTES};
  public int[] sizesOfTokens() {
    return SIZES;
  }

  // almost identical to ApplicationRowKey.getRowKey()
  public byte[] encodeKey(ApplicationRowKey key) {
    byte[] first =
        Bytes.toBytes(Separator.QUALIFIERS.joinEncoded(key.getClusterId(),
            key.getUserId(), key.getFlowName()));
    // Note that flowRunId is a long, so we can't encode them all at the same
    // time.
    byte[] second =
        Bytes.toBytes(TimelineStorageUtils.invertLong(key.getFlowRunId()));
    byte[] third = TimelineStorageUtils.encodeAppId(key.getAppId());
    return Separator.QUALIFIERS.join(first, second, third);
  }

  // almost identical to ApplicationRowKey.parseRowKey()
  public ApplicationRowKey decodeKey(byte[] data) {
    // use the new version of the split method that takes the sizes
    byte[][] rowKeyComponents =
        Separator.QUALIFIERS.split(data, sizesOfTokens());

    if (rowKeyComponents.length < 5) {
      throw new IllegalArgumentException("the row key is not valid for " +
          "an application");
    }

    String clusterId =
        Separator.QUALIFIERS.decode(Bytes.toString(rowKeyComponents[0]));
    String userId =
        Separator.QUALIFIERS.decode(Bytes.toString(rowKeyComponents[1]));
    String flowName =
        Separator.QUALIFIERS.decode(Bytes.toString(rowKeyComponents[2]));
    long flowRunId =
        TimelineStorageUtils.invertLong(Bytes.toLong(rowKeyComponents[3]));
    String appId = TimelineStorageUtils.decodeAppId(rowKeyComponents[4]);
    return new ApplicationRowKey(clusterId, userId, flowName, flowRunId, appId);
  }
}

public class EventColumnName {
  private final String id;
  private final long timestamp;
  private final String infoKey;

  public EventColumnName(String id, long timestamp, String infoKey) {
    this.id = id;
    this.timestamp = timestamp;
    this.infoKey = infoKey;
  }

  public String getId() {
    return id;
  }

  public long getTimestamp() {
    return timestamp;
  }

  public String getInfoKey() {
    return infoKey;
  }
}

public class EventColumnNameConverter
    implements CompoundKeyConverter<EventColumnName> {
  private static final int[] SIZES = {0, Long.BYTES, 0};
  public int[] sizesOfTokens() {
    return SIZES;
  }

  // from HBaseTimelineWriterImpl.storeEvents()
  public byte[] encodeKey(EventColumnName q) {
    return ColumnHelper.getCompoundColumnQualifierBytes(q.getId(),
        Bytes.toBytes(TimelineStorageUtils.invertLong(q.getTimestamp())),
        Bytes.toBytes(q.getInfoKey()));
  }

  public EventColumnName decodeKey(byte[] data) {
    // use the new split() method that takes the sizes
    byte[][] components = Separator.VALUES.split(data, sizesOfTokens());
    if (components.length != 3) {
      throw new IllegalArgumentException("the column name is not valid");
    }

    String id = Bytes.toString(components[0]);
    long ts = TimelineStorageUtils.invertLong(Bytes.toLong(components[1]));
    String infoKey =
        components[2].length == 0 ? null : Bytes.toString(components[2]);
    return new EventColumnName(id, ts, infoKey);
  }
}

HBaseTimelineWriterImpl.java:
private void storeEvents(...) {
  ...
  CompoundKeyConverter<EventColumnName> converter =
      new EventColumnNameConverter();
  EventColumnName e =
      new EventColumnName(eventId, eventTimestamp, info.getKey());
  byte[] compoundColumnQualifierBytes = converter.encodeKey(e);
  ...
}

TimelineStorageUtils.java:
public static void readEvents(...) {
  ...
  CompoundKeyConverter<EventColumnName> converter =
      new EventColumnNameConverter();
  // ColumnPrefix API should change as well
  Map<EventColumnName, Object> eventsResult =
      prefix.readResultsHavingCompoundColumnQualifiers(result, converter);
  for (Map.Entry<EventColumnName, Object> eventResult :
      eventsResult.entrySet()) {
    ...
  }
}

ColumnHelper.java:
public  Map readResultsHavingCompoundColumnQualifiers(Result 
result,

[jira] [Commented] (YARN-5109) timestamps are stored unencoded causing parse errors

2016-05-19 Thread Varun Saxena (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15291983#comment-15291983
 ] 

Varun Saxena commented on YARN-5109:


Writing down what we discussed in the meeting with regard to this.

I have attached a WIP patch of whatever I had done so far.
Some things may have to be fine-tuned (null checks etc.). Probably limit and 
sizes need not both be passed into splitRanges. But anyway, this works well for 
row keys as per my tests.

For column qualifiers, we could either encode the bytes (representing longs, 
ints, app ids) before writing them into the backend, or use the same approach 
we adopted for row keys, i.e. do not encode, but when splitting ignore 
separators inside longs, ints, etc. until they are fully read.
I had a patch for the former too, but the consensus seems to be towards the latter.

Sangjin and Joep thought that we should reorganize our row key and column 
qualifier parsing logic into a set of converters. It was decided that this 
approach will be explored further.




[jira] [Commented] (YARN-5109) timestamps are stored unencoded causing parse errors

2016-05-19 Thread Varun Saxena (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15291529#comment-15291529
 ] 

Varun Saxena commented on YARN-5109:


Sure. Let's discuss this in the meeting.
Basically, column qualifiers are read in ColumnHelper, which is a common piece 
of code, completely unaware of the details of specific column qualifiers.
So we may have to pass this size information somehow to ColumnHelper, but this 
will increase the constructor size.

Or we can consider going with encoding the bytes. I wasn't considering escaping 
existing patterns though, and such patterns may be more likely to appear in 
longs than in strings.




[jira] [Commented] (YARN-5109) timestamps are stored unencoded causing parse errors

2016-05-19 Thread Sangjin Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15291498#comment-15291498
 ] 

Sangjin Lee commented on YARN-5109:
---

Good point about the column qualifiers not needing the correct order. One other 
complication with the column qualifiers is that the call chain is several 
levels deep. The changes to use the new {{split()}} method there would be a bit 
bigger.

If we were to go the route of encoding the bytes as well, there is one other 
issue we need to be mindful of. We need to guard against the occurrences of the 
encoded equivalent in the original bytes. For example, "=" would be encoded 
into "%1$". A problem would arise if the original bytes already contained "%1$" 
however unlikely that may be. Consider the following original bytes (totally 
made up with ascii characters):
{noformat}
t=h%1$ig
{noformat}

If we simply encode "=", then we get
{noformat}
t%1$h%1$ig
{noformat}

Now, if we read this back and decode it, we would decode it to
{noformat}
t=h=ig
{noformat}

To do this properly, we'd need to "escape" the existing patterns *before* 
encoding for the separator. The reverse should be done when decoding it.
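
The collision and one possible remedy can be sketched on plain strings. This is only an illustration under stated assumptions: `EscapingDemo` and its methods are hypothetical, and doubling an existing "%" before encoding is one escaping scheme among several, not what Separator does. The only values taken from the discussion are the "=" to "%1$" mapping and the sample string.

```java
final class EscapingDemo {
  /** Encodes the separator only; existing "%1$" sequences are left as is. */
  static String naiveEncode(String s) {
    return s.replace("=", "%1$");
  }

  static String naiveDecode(String s) {
    return s.replace("%1$", "=");
  }

  /** Assumed scheme: escape every existing '%' first, then encode '='. */
  static String encode(String s) {
    return s.replace("%", "%%").replace("=", "%1$");
  }

  /** Reverses encode() in a single left-to-right pass. */
  static String decode(String s) {
    StringBuilder out = new StringBuilder();
    int i = 0;
    while (i < s.length()) {
      if (s.startsWith("%%", i)) {
        out.append('%');   // an escaped literal '%'
        i += 2;
      } else if (s.startsWith("%1$", i)) {
        out.append('=');   // an encoded separator
        i += 3;
      } else {
        out.append(s.charAt(i));
        i++;
      }
    }
    return out.toString();
  }

  public static void main(String[] args) {
    String original = "t=h%1$ig";
    System.out.println(naiveDecode(naiveEncode(original))); // t=h=ig
    System.out.println(decode(encode(original)));           // t=h%1$ig
  }
}
```

With the escape applied first, every "%" in the encoded form starts either "%%" or "%1$", so the decoder never has to guess which one it is looking at.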

To be clear, this is an existing issue (even with strings). We went ahead 
without treating for this as we felt that this is unlikely to occur in a 
string. But if we're going to revisit encoding, we might want to address that 
as well.

We can discuss the details offline if needed.




[jira] [Commented] (YARN-5109) timestamps are stored unencoded causing parse errors

2016-05-19 Thread Varun Saxena (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15290991#comment-15290991
 ] 

Varun Saxena commented on YARN-5109:


Thanks [~sjlee0].
I did realize this issue with timestamps in row keys. I was trying a workaround 
for this kind of scenario with row keys by handling it on the parse side in the 
individual row key classes (i.e. do not encode, and on the read side consider a 
separator as part of the long/int if it hasn't been fully read yet).
But your suggestion of split looks great. This reduces the changes. I will code 
according to this.

For column qualifiers though, this change isn't required and we can actually do 
encoding, because for columns we will either have a single column value filter or 
a qualifier filter with prefixes applied. If I am not wrong, we do not need to 
preserve ordering for column qualifiers.

However, we can keep it consistent because the split method can be changed as per 
your suggestion.
I will remove the encoding bytes routine then.




[jira] [Commented] (YARN-5109) timestamps are stored unencoded causing parse errors

2016-05-18 Thread Sangjin Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15290166#comment-15290166
 ] 

Sangjin Lee commented on YARN-5109:
---

Here is a proposal. Instead of blindly splitting along the separator 
boundaries, we need to be able to tell the Separator to tokenize using the size 
also. The following is a quick prototype I put together to show the concept. 
This is an overloaded {{split()}} method:

{code:title=TimelineStorageUtils.java|borderStyle=solid}
  public static final int NO_LIMIT = 0;

  public static List<Range> splitRanges(byte[] source, byte[] separator,
      int[] sizes) {
    List<Range> segments = new ArrayList<Range>();
    if (source == null || separator == null || sizes == null) {
      return segments;
    }
    int start = 0;
    int i = 0;
    int k = 0;
    itersource: while (i < source.length && segments.size() < sizes.length) {
      int currentTokenSize = sizes[k];
      if (currentTokenSize > NO_LIMIT) {
        // we explicitly grab a fixed number of bytes
        if (start + currentTokenSize > source.length) {
          // it's seeking beyond the source boundary
          throw new IllegalArgumentException("source is " + source.length +
              " bytes long and we're asking for " + (start + currentTokenSize));
        }
        segments.add(new Range(start, start + currentTokenSize));
        start += currentTokenSize;
        i += currentTokenSize;
        k++;
        // if there is more to parse, there must be a separator; strip it
        if (k <= sizes.length - 1) {
          for (int j = 0; j < separator.length; j++) {
            // also guard against running off the end of the source
            if (i + j >= source.length || source[i + j] != separator[j]) {
              throw new IllegalArgumentException("separator is expected");
            }
          }
          // matched the separator
          start = i + separator.length;
          i += separator.length;
        }
      } else if (currentTokenSize == NO_LIMIT) { // use the separator
        // continue until we match the separator
        for (int j = 0; j < separator.length; j++) {
          if (i + j >= source.length || source[i + j] != separator[j]) {
            i++;
            continue itersource;
          }
        }
        // we just matched all separator elements; this has to happen after
        // the loop so that multi-byte separators are matched in full
        segments.add(new Range(start, i));
        start = i + separator.length;
        i += separator.length;
        k++;
      } else {
        throw new IllegalArgumentException("negative size provided");
      }
    }
    // add the final segment
    if (start < source.length && segments.size() < sizes.length) {
      // by deduction this can happen only if the token size = NO_LIMIT
      segments.add(new Range(start, source.length));
    }
    return segments;
  }
{code}

You can basically instruct the utility if you expect certain size tokens when 
it parses the bytes. Value = 0 indicates the existing parsing behavior (parse 
until you hit the separator). Positive values indicate the number of bytes to 
grab whether or not there is a separator in the middle. For example, if we 
expect the structure to be a string, a long (as bytes), and a string you can 
invoke it with \{0, Long.BYTES, 0\}.

That way, we can dictate how the row keys and column name qualifiers should be 
parsed.


[jira] [Commented] (YARN-5109) timestamps are stored unencoded causing parse errors

2016-05-18 Thread Sangjin Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15289895#comment-15289895
 ] 

Sangjin Lee commented on YARN-5109:
---

I spoke with [~jrottinghuis] offline about this. Initially we were thinking 
that we should encode those characters even in the case of bytes (essentially 
creating {{Separator.joinEncoded(byte[]...)}} and removing the raw 
{{Separator.join()}} method), but we are realizing that won't work.

The key here is that we not only need to handle those separator characters 
("=", "!", etc.) but also *preserve the ordering*. For example, suppose we have 
two timestamps ({{ts1}} and {{ts2}}) where {{ts1 < ts2}}. And assume {{ts2}} 
has a separator character in it. If we blindly encoded the separator character, 
we could easily violate {{ts1 < ts2}} once they are written. This would break 
all sorts of things, including range scans.
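
The ordering problem can be shown with two made-up values. This is a hedged sketch: `OrderingDemo` and its helpers are hypothetical, the "=" to "%1$" replacement is the encoding mentioned elsewhere in this discussion, and the comparison is the unsigned lexicographic byte order that HBase uses to sort keys.

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;

final class OrderingDemo {
  /** Replaces every '=' byte (0x3D) with the three bytes of "%1$". */
  static byte[] encodeSeparator(byte[] raw) {
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    for (byte b : raw) {
      if (b == '=') {
        out.write('%');
        out.write('1');
        out.write('$');
      } else {
        out.write(b);
      }
    }
    return out.toByteArray();
  }

  /** Unsigned lexicographic comparison of two byte arrays. */
  static int compare(byte[] a, byte[] b) {
    int n = Math.min(a.length, b.length);
    for (int i = 0; i < n; i++) {
      int d = (a[i] & 0xFF) - (b[i] & 0xFF);
      if (d != 0) {
        return d;
      }
    }
    return a.length - b.length;
  }

  static byte[] toBytes(long v) {
    return ByteBuffer.allocate(Long.BYTES).putLong(v).array();
  }

  public static void main(String[] args) {
    long ts1 = 0x3000L; // seventh byte is 0x30
    long ts2 = 0x3D00L; // seventh byte is 0x3D, i.e. '='
    // raw bytes keep ts1 < ts2: the result is negative
    System.out.println(compare(toBytes(ts1), toBytes(ts2)));
    // after encoding, ts2 sorts first because '%' (0x25) < 0x30: positive
    System.out.println(compare(encodeSeparator(toBytes(ts1)),
        encodeSeparator(toBytes(ts2))));
  }
}
```

Any byte between 0x26 and 0x3C in the same position as the encoded separator flips the order this way, so the inversion is not limited to contrived values.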

My proposal is this. In almost all of these cases, the structure of the data 
we're storing and parsing is known strongly, whether it is the row key or the 
column qualifier. The problem with the current parsing is it uses solely 
splitting by separator. We should use the full data structure it knows already 
to parse correctly.

For example, if we know that the structure is (string)=(timestamp)=(string), we 
can parse the first string, and then take the next 8 bytes *without splitting 
again* as we know it's a timestamp anyway and convert it into the long number, 
and take the last token after that. We should be able to follow the same idea 
in all cases.
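
A minimal sketch of that idea for the (string)=(timestamp)=(string) case follows. It is not the eventual patch: `EventColumnParser`, `Parsed`, `encode` and `parse` are hypothetical names, the inversion is assumed to be `Long.MAX_VALUE - timestamp`, and the event id is assumed not to contain "=" itself.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

final class EventColumnParser {
  static final class Parsed {
    final String id;
    final long timestamp;
    final String infoKey;

    Parsed(String id, long timestamp, String infoKey) {
      this.id = id;
      this.timestamp = timestamp;
      this.infoKey = infoKey;
    }
  }

  /** Writes id=invertedTimestamp=infoKey, leaving the 8 timestamp bytes raw. */
  static byte[] encode(String id, long timestamp, String infoKey) {
    byte[] idBytes = id.getBytes(StandardCharsets.UTF_8);
    byte[] keyBytes = infoKey.getBytes(StandardCharsets.UTF_8);
    return ByteBuffer
        .allocate(idBytes.length + 1 + Long.BYTES + 1 + keyBytes.length)
        .put(idBytes).put((byte) '=')
        .putLong(Long.MAX_VALUE - timestamp)
        .put((byte) '=').put(keyBytes)
        .array();
  }

  static Parsed parse(byte[] data) {
    // 1. the event id runs up to the first separator
    int idEnd = 0;
    while (idEnd < data.length && data[idEnd] != '=') {
      idEnd++;
    }
    // 2. the next 8 bytes are the timestamp, whatever they contain
    int tsStart = idEnd + 1;
    int tsEnd = tsStart + Long.BYTES;
    if (tsEnd >= data.length || data[tsEnd] != '=') {
      throw new IllegalArgumentException("the column name is not valid");
    }
    long inverted = ByteBuffer.wrap(data, tsStart, Long.BYTES).getLong();
    // 3. everything after the second separator is the info key
    String id = new String(data, 0, idEnd, StandardCharsets.UTF_8);
    String infoKey = new String(data, tsEnd + 1, data.length - tsEnd - 1,
        StandardCharsets.UTF_8);
    return new Parsed(id, Long.MAX_VALUE - inverted, infoKey);
  }

  public static void main(String[] args) {
    // this timestamp inverts to the bytes quoted in the issue description
    long ts = Long.MAX_VALUE - 0x7FFFFEAB44593D99L;
    Parsed p = parse(encode("YARN_RM_CONTAINER_CREATED", ts,
        "YARN_CONTAINER_ALLOCATED_HOST"));
    System.out.println(p.id + " " + (p.timestamp == ts) + " " + p.infoKey);
  }
}
```

Because the timestamp bytes are never encoded, their byte order on disk is unchanged, which is the property the encoding approach could not guarantee.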

Thoughts? [~varun_saxena], let me know if you'd like to take a stab at that 
idea, or I should.




[jira] [Commented] (YARN-5109) timestamps are stored unencoded causing parse errors

2016-05-18 Thread Sangjin Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15289571#comment-15289571
 ] 

Sangjin Lee commented on YARN-5109:
---

These are all the cases of "naked" joins:
{noformat}
org.apache.hadoop.yarn.server.timelineservice.storage.common.ColumnHelper.getColumnQualifier(byte[], byte[])
org.apache.hadoop.yarn.server.timelineservice.storage.common.ColumnHelper.getColumnQualifier(byte[], long)
org.apache.hadoop.yarn.server.timelineservice.storage.common.ColumnHelper.getColumnQualifier(byte[], String)
org.apache.hadoop.yarn.server.timelineservice.storage.common.ColumnHelper.getCompoundColumnQualifierBytes(String, byte[]...)
org.apache.hadoop.yarn.server.timelineservice.storage.flow.FlowActivityRowKey.getRowKey(String, long, String, String)
org.apache.hadoop.yarn.server.timelineservice.storage.entity.EntityRowKey.getRowKey(String, String, String, Long, String, String, String)
org.apache.hadoop.yarn.server.timelineservice.storage.application.ApplicationRowKey.getRowKey(String, String, String, Long, String)
org.apache.hadoop.yarn.server.timelineservice.storage.flow.FlowRunRowKey.getRowKey(String, String, String, Long)
org.apache.hadoop.yarn.server.timelineservice.storage.apptoflow.AppToFlowRowKey.getRowKey(String, String)
org.apache.hadoop.yarn.server.timelineservice.storage.flow.FlowActivityRowKey.getRowKeyPrefix(String, long)
org.apache.hadoop.yarn.server.timelineservice.storage.entity.EntityRowKey.getRowKeyPrefix(String, String, String, Long, String, String)
org.apache.hadoop.yarn.server.timelineservice.storage.entity.EntityRowKey.getRowKeyPrefix(String, String, String, Long, String)
org.apache.hadoop.yarn.server.timelineservice.storage.application.ApplicationRowKey.getRowKeyPrefix(String, String, String, Long)
org.apache.hadoop.yarn.server.timelineservice.storage.application.ApplicationRowKey.getRowKeyPrefix(String, String, String)
{noformat}




[jira] [Commented] (YARN-5109) timestamps are stored unencoded causing parse errors

2016-05-18 Thread Varun Saxena (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15289237#comment-15289237
 ] 

Varun Saxena commented on YARN-5109:


Thanks [~sjlee0].
Yes, I am in the process of coding this, i.e. the part regarding column 
qualifiers. I have to look at row keys too. I had also written a UT to simulate 
the failure before assigning the JIRA to myself.
Will have a patch by tomorrow.

You are correct that we need to encode in getColumnQualifier too, because while 
reading it back we split into up to 2 segments for non-compound column 
qualifiers (for fetching info).




[jira] [Commented] (YARN-5109) timestamps are stored unencoded causing parse errors

2016-05-18 Thread Sangjin Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15289207#comment-15289207
 ] 

Sangjin Lee commented on YARN-5109:
---

Thanks for looking into this [~varun_saxena]. It might be good to write a small 
unit test for this. FYI, the offending timestamp in the above example is 
1463437148774.
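
To make the arithmetic concrete, here is a minimal sketch that reproduces the offending bytes from that timestamp. It assumes that {{TimelineStorageUtils.invertLong()}} computes {{Long.MAX_VALUE - value}} (which is consistent with the bytes in the description) and uses {{java.nio.ByteBuffer}} in place of HBase's {{Bytes.toBytes(long)}} so that it runs without HBase on the classpath. The class and helper names are made up for illustration.

```java
import java.nio.ByteBuffer;

public class InvertedTimestampBytes {
  // Assumption: stand-in for TimelineStorageUtils.invertLong().
  public static long invertLong(long value) {
    return Long.MAX_VALUE - value;
  }

  // Big-endian 8-byte layout, the same layout Bytes.toBytes(long) produces.
  public static byte[] toBytes(long value) {
    return ByteBuffer.allocate(Long.BYTES).putLong(value).array();
  }

  // Returns true if any byte of the array equals the target byte.
  public static boolean contains(byte[] bytes, byte target) {
    for (byte b : bytes) {
      if (b == target) {
        return true;
      }
    }
    return false;
  }

  public static void main(String[] args) {
    byte[] ts = toBytes(invertLong(1463437148774L));
    StringBuilder hex = new StringBuilder();
    for (byte b : ts) {
      hex.append(String.format("%02X ", b));
    }
    // prints "7F FF FE AB 44 59 3D 99"; 0x3D is '='
    System.out.println(hex.toString().trim());
    // prints "true"
    System.out.println(contains(ts, (byte) '='));
  }
}
```

The seventh byte is 0x3D, which is the "=" that shows up in the middle of the timestamp portion of the column name.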

I think the scope of this issue is rather big unfortunately. If I'm not 
mistaken, any time a byte array is passed into {{Separator.join()}} without 
encoding we need to encode it. Of course it also means we need to decode it on 
the way out.
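
As a rough illustration of "encode on the way in, decode on the way out", here is a sketch of a byte-stuffing scheme. This is not what {{Separator}} does today and not necessarily what the patch should do; the escape byte, the escape codes and every name in it are hypothetical.

```java
import java.io.ByteArrayOutputStream;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class EscapedJoinSketch {
  static final byte SEP = (byte) '=';   // separator used between components
  static final byte ESC = (byte) '%';   // hypothetical escape byte
  static final byte ESC_CODE = 1;       // ESC, 1 stands for a literal ESC
  static final byte SEP_CODE = 2;       // ESC, 2 stands for a literal SEP

  // Rewrites a component so that it never contains a raw SEP byte.
  public static byte[] encode(byte[] raw) {
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    for (byte b : raw) {
      if (b == ESC) {
        out.write(ESC);
        out.write(ESC_CODE);
      } else if (b == SEP) {
        out.write(ESC);
        out.write(SEP_CODE);
      } else {
        out.write(b);
      }
    }
    return out.toByteArray();
  }

  // Reverses encode().
  public static byte[] decode(byte[] encoded) {
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    for (int i = 0; i < encoded.length; i++) {
      byte b = encoded[i];
      if (b != ESC) {
        out.write(b);
        continue;
      }
      if (i + 1 >= encoded.length) {
        throw new IllegalArgumentException("dangling escape byte");
      }
      byte code = encoded[++i];
      if (code == ESC_CODE) {
        out.write(ESC);
      } else if (code == SEP_CODE) {
        out.write(SEP);
      } else {
        throw new IllegalArgumentException("unknown escape code " + code);
      }
    }
    return out.toByteArray();
  }

  // Encodes every component, then joins them with SEP.
  public static byte[] join(byte[]... parts) {
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    for (int i = 0; i < parts.length; i++) {
      if (i > 0) {
        out.write(SEP);
      }
      byte[] enc = encode(parts[i]);
      out.write(enc, 0, enc.length);
    }
    return out.toByteArray();
  }

  // Splits on SEP, then decodes every segment.
  public static List<byte[]> split(byte[] joined) {
    List<byte[]> parts = new ArrayList<>();
    int start = 0;
    for (int i = 0; i <= joined.length; i++) {
      if (i == joined.length || joined[i] == SEP) {
        parts.add(decode(Arrays.copyOfRange(joined, start, i)));
        start = i + 1;
      }
    }
    return parts;
  }
}
```

One caveat with this kind of escaping is that it changes both the length and the byte-wise sort order of the encoded component, which matters wherever we rely on ordering (e.g. inverted timestamps in row keys), so it may not be the right choice everywhere.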

For example, {{ColumnHelper.getColumnQualifier()}} accepts both 
{{columnPrefixBytes}} and {{qualifier}} as is and joins them.

I can work with you to identify all the places where this is a problem and also 
the solution. Let's discuss it here on the JIRA.




[jira] [Commented] (YARN-5109) timestamps are stored unencoded causing parse errors

2016-05-17 Thread Sangjin Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15288444#comment-15288444
 ] 

Sangjin Lee commented on YARN-5109:
---

For example, code that creates the column name for events:
{code}
byte[] eventTs =
    Bytes.toBytes(TimelineStorageUtils.invertLong(eventTimestamp));
...
byte[] compoundColumnQualifierBytes =
    EntityColumnPrefix.EVENT.
        getCompoundColQualBytes(eventId, eventTs, null);
...
public static byte[] getCompoundColumnQualifierBytes(String qualifier,
    byte[]...components) {
  byte[] colQualBytes = Bytes.toBytes(Separator.VALUES.encode(qualifier));
  for (int i = 0; i < components.length; i++) {
    colQualBytes = Separator.VALUES.join(colQualBytes, components[i]);
  }
  return colQualBytes;
}
{code}
The {{getCompoundColumnQualifierBytes()}} method uses the bytes from the 
timestamp as is without any encoding for VALUES ({{\x3d}}).
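
The effect on the read side can be sketched as follows. This is a self-contained stand-in, not the actual {{Separator}} code: {{nakedJoin()}} and {{split()}} are hypothetical helpers that join with "=" without encoding and split on every "=", and {{invertLong()}} is assumed to be {{Long.MAX_VALUE - value}}.

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class NakedJoinDemo {
  static final byte VALUES_SEP = (byte) '=';

  // Hypothetical stand-in for a "naked" join: components are copied as is.
  public static byte[] nakedJoin(byte[]... parts) {
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    for (int i = 0; i < parts.length; i++) {
      if (i > 0) {
        out.write(VALUES_SEP);
      }
      out.write(parts[i], 0, parts[i].length);
    }
    return out.toByteArray();
  }

  // Splits on every separator byte, as a reader that trusts the format would.
  public static List<byte[]> split(byte[] source) {
    List<byte[]> segments = new ArrayList<>();
    int start = 0;
    for (int i = 0; i <= source.length; i++) {
      if (i == source.length || source[i] == VALUES_SEP) {
        segments.add(Arrays.copyOfRange(source, start, i));
        start = i + 1;
      }
    }
    return segments;
  }

  // Builds (event id)=(inverted timestamp)=(event info key) and counts the
  // segments seen when it is read back.
  public static int countSegments(long eventTimestamp) {
    byte[] eventId = "YARN_RM_CONTAINER_CREATED".getBytes(StandardCharsets.UTF_8);
    // Assumption: invertLong() is Long.MAX_VALUE - value, stored big-endian.
    byte[] eventTs = ByteBuffer.allocate(Long.BYTES)
        .putLong(Long.MAX_VALUE - eventTimestamp).array();
    byte[] infoKey =
        "YARN_CONTAINER_ALLOCATED_HOST".getBytes(StandardCharsets.UTF_8);
    return split(nakedJoin(eventId, eventTs, infoKey)).size();
  }

  public static void main(String[] args) {
    // Offending timestamp: inverted bytes contain 0x3D, so 4 segments come back.
    System.out.println(countSegments(1463437148774L));
    // 256 ms later the same byte is 0x3C, so the expected 3 segments come back.
    System.out.println(countSegments(1463437149030L));
  }
}
```

Whether a given event parses correctly therefore depends purely on the byte values of its timestamp.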

I believe a similar issue exists with row keys. In most cases, longs are 
passed to the row key without any encoding for QUALIFIERS. If any of the byte 
values happens to equal the QUALIFIERS separator ({{\x21}}), row key parsing 
will fail.
