[jira] [Commented] (YARN-3049) [Storage Implementation] Implement storage reader interface to fetch raw data from HBase backend

2016-07-10 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-3049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15369761#comment-15369761
 ] 

Hudson commented on YARN-3049:
--

SUCCESS: Integrated in Hadoop-trunk-Commit #10074 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/10074/])
YARN-3049. [Storage Implementation] Implement storage reader interface (sjlee: rev 9e5155be363c6610ccf41fe08b7f1394f353ea65)
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice/src/main/java/org/apache/hadoop/yarn/server/timelineservice/storage/entity/EntityColumnPrefix.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice/src/main/java/org/apache/hadoop/yarn/server/timelineservice/storage/apptoflow/AppToFlowColumn.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/records/timelineservice/TimelineEntity.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice/src/main/java/org/apache/hadoop/yarn/server/timelineservice/storage/apptoflow/package-info.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice/src/main/java/org/apache/hadoop/yarn/server/timelineservice/storage/common/TimelineEntitySchemaConstants.java
* hadoop-yarn-project/hadoop-yarn/dev-support/findbugs-exclude.xml
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice/src/main/java/org/apache/hadoop/yarn/server/timelineservice/storage/TimelineSchemaCreator.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice/src/main/java/org/apache/hadoop/yarn/server/timelineservice/storage/common/BaseTable.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice/src/main/java/org/apache/hadoop/yarn/server/timelineservice/storage/HBaseTimelineReaderImpl.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice/src/main/java/org/apache/hadoop/yarn/server/timelineservice/storage/common/TimelineReaderUtils.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice/src/main/java/org/apache/hadoop/yarn/server/timelineservice/storage/apptoflow/AppToFlowRowKey.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice/src/main/java/org/apache/hadoop/yarn/server/timelineservice/storage/common/ColumnPrefix.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice/src/main/java/org/apache/hadoop/yarn/server/timelineservice/storage/apptoflow/AppToFlowColumnFamily.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice/src/main/java/org/apache/hadoop/yarn/server/timelineservice/storage/apptoflow/AppToFlowTable.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice/src/main/java/org/apache/hadoop/yarn/server/timelineservice/storage/FileSystemTimelineReaderImpl.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice/src/main/java/org/apache/hadoop/yarn/server/timelineservice/storage/entity/EntityRowKey.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice/src/test/java/org/apache/hadoop/yarn/server/timelineservice/storage/TestHBaseTimelineWriterImpl.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice/src/main/java/org/apache/hadoop/yarn/server/timelineservice/storage/entity/EntityTable.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice/src/main/java/org/apache/hadoop/yarn/server/timelineservice/storage/HBaseTimelineWriterImpl.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice/src/main/java/org/apache/hadoop/yarn/server/timelineservice/storage/entity/EntityColumnFamily.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice/src/main/java/org/apache/hadoop/yarn/server/timelineservice/storage/entity/EntityColumn.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice/src/main/java/org/apache/hadoop/yarn/server/timelineservice/storage/common/TimelineHBaseSchemaConstants.java


> [Storage Implementation] Implement storage reader interface to fetch raw data 
> from HBase backend
> 
>
> Key: YARN-3049
> URL: https://issues.apache.org/jira/browse/YARN-3049
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Reporter: Sangjin Lee
>Assignee: Zhijie Shen
> Fix For: YARN-2928
>
> Attachments: YARN-3049-WIP.1.patch, YARN-

[jira] [Commented] (YARN-3049) [Storage Implementation] Implement storage reader interface to fetch raw data from HBase backend

2015-08-07 Thread Junping Du (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-3049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14662068#comment-14662068
 ] 

Junping Du commented on YARN-3049:
--

+1. Patch LGTM. [~sjlee0], please feel free to go ahead and check in the 
latest patch. Thx!

> [Storage Implementation] Implement storage reader interface to fetch raw data 
> from HBase backend
> 
>
> Key: YARN-3049
> URL: https://issues.apache.org/jira/browse/YARN-3049
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Reporter: Sangjin Lee
>Assignee: Zhijie Shen
> Attachments: YARN-3049-WIP.1.patch, YARN-3049-WIP.2.patch, 
> YARN-3049-WIP.3.patch, YARN-3049-YARN-2928.2.patch, 
> YARN-3049-YARN-2928.3.patch, YARN-3049-YARN-2928.4.patch, 
> YARN-3049-YARN-2928.5.patch, YARN-3049-YARN-2928.6.patch, 
> YARN-3049-YARN-2928.7.patch
>
>
> Implement existing ATS queries with the new ATS reader design.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (YARN-3049) [Storage Implementation] Implement storage reader interface to fetch raw data from HBase backend

2015-08-07 Thread Sangjin Lee (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-3049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14662060#comment-14662060
 ] 

Sangjin Lee commented on YARN-3049:
---

Let me know if there are any additional comments. I'll wait for about an hour 
before committing this. Thanks.


[jira] [Commented] (YARN-3049) [Storage Implementation] Implement storage reader interface to fetch raw data from HBase backend

2015-08-06 Thread Vrushali C (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-3049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14661313#comment-14661313
 ] 

Vrushali C commented on YARN-3049:
--

Filed https://issues.apache.org/jira/browse/YARN-4025 for all the 
timestamp/long/byte-to-string conversions, and for adding other APIs and 
functions as needed to support the conversions and argument passing.


[jira] [Commented] (YARN-3049) [Storage Implementation] Implement storage reader interface to fetch raw data from HBase backend

2015-08-06 Thread Sangjin Lee (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-3049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14661012#comment-14661012
 ] 

Sangjin Lee commented on YARN-3049:
---

Yes, +1 on proceeding with this patch and addressing the long conversion in 
another JIRA.


[jira] [Commented] (YARN-3049) [Storage Implementation] Implement storage reader interface to fetch raw data from HBase backend

2015-08-06 Thread Li Lu (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-3049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14661007#comment-14661007
 ] 

Li Lu commented on YARN-3049:
-

I checked EntityRowKey.java and it seems we never convert flowRunIds into 
Strings when forming a row key. I think we're fine, since we always treat row 
keys as byte arrays?


[jira] [Commented] (YARN-3049) [Storage Implementation] Implement storage reader interface to fetch raw data from HBase backend

2015-08-06 Thread Vrushali C (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-3049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14660991#comment-14660991
 ] 

Vrushali C commented on YARN-3049:
--

bq. I'm worried that Bytes.toString() doesn't make the long integer be stored 
as the way we want. 
Yes. When we store a long value, we need to store it as 
Bytes.toBytes(Long), not as Bytes.toBytes(Long value as String). Stored as a 
long, the values sort in numerical order. 

The same applies to the row key: we need to store a Long as 
Bytes.toBytes(Long) to ensure numerically sorted order. 
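
To illustrate the sort-order point, here is a minimal sketch that runs without HBase. It assumes that Bytes.toBytes(long) produces the 8-byte big-endian form (imitated here by the hypothetical encodeLong helper) and that keys are compared byte by byte as unsigned values. The class and method names are made up for the example, and the ordering claim only covers non-negative values such as timestamps.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Illustrative only: encodeLong stands in for HBase's Bytes.toBytes(long),
// assumed to be the 8-byte big-endian representation.
class LongKeyOrdering {
  static byte[] encodeLong(long v) {
    return ByteBuffer.allocate(Long.BYTES).putLong(v).array();
  }

  // The problematic alternative: the decimal text of the long, as bytes.
  static byte[] encodeAsString(long v) {
    return Long.toString(v).getBytes(StandardCharsets.UTF_8);
  }

  // Unsigned lexicographic comparison of two byte arrays.
  static int compareBytes(byte[] a, byte[] b) {
    int n = Math.min(a.length, b.length);
    for (int i = 0; i < n; i++) {
      int diff = (a[i] & 0xff) - (b[i] & 0xff);
      if (diff != 0) {
        return diff;
      }
    }
    return a.length - b.length;
  }

  public static void main(String[] args) {
    long small = 9L;
    long large = 10L;
    // Fixed-width binary form: 9 sorts before 10.
    System.out.println(
        compareBytes(encodeLong(small), encodeLong(large)) < 0);
    // Decimal string form: "9" sorts after "10".
    System.out.println(
        compareBytes(encodeAsString(small), encodeAsString(large)) > 0);
  }
}
```

Both comparisons evaluate to true: the fixed-width encoding keeps 9 before 10, while the string encoding puts "10" first because it compares the characters '1' and '9'.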



[jira] [Commented] (YARN-3049) [Storage Implementation] Implement storage reader interface to fetch raw data from HBase backend

2015-08-06 Thread Vrushali C (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-3049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14660987#comment-14660987
 ] 

Vrushali C commented on YARN-3049:
--


Yes I will take that jira up. 


[jira] [Commented] (YARN-3049) [Storage Implementation] Implement storage reader interface to fetch raw data from HBase backend

2015-08-06 Thread Zhijie Shen (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-3049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14660952#comment-14660952
 ] 

Zhijie Shen commented on YARN-3049:
---

As the issue is not blocking the whole reader implementation, how about letting 
this patch in first, [~sjlee0]?

Some more comments about the issue:

1. ColumnHelper needs to be updated as well to return a byte[] column name 
instead of a String one.

2. I'm worried that Bytes.toString() doesn't store the long integer the way we 
want. If it isn't stored as 8 bytes, we cannot guarantee the order of event 
columns.

3. FlowRunId in the row key should be fine, because the row key is never 
converted to String again. But it's good to double check.


[jira] [Commented] (YARN-3049) [Storage Implementation] Implement storage reader interface to fetch raw data from HBase backend

2015-08-06 Thread Li Lu (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-3049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14660935#comment-14660935
 ] 

Li Lu commented on YARN-3049:
-

bq. Also, to Li Lu's point, we should provide an additional api for 
getColumnQualifier which accepts a pre-encoded byte array. It will be in 
addition to the existing api which accepts a String, so that we can use either 
one as applicable. 
+1 for this solution. We can address this in another JIRA so that we're not 
blocking the reader patch? Would you like to take this JIRA [~vrushalic]? If 
not I can do the fix. 


[jira] [Commented] (YARN-3049) [Storage Implementation] Implement storage reader interface to fetch raw data from HBase backend

2015-08-06 Thread Vrushali C (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-3049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14660923#comment-14660923
 ] 

Vrushali C commented on YARN-3049:
--

It looks like the conversion back to String was done to avoid an additional API 
on store. But if this is causing issues when a long value is part of the column 
qualifier, I think we should modify or add to the store API to include a 
variant that accepts a byte array for the compoundColumnQualifier. 

Specifically, I think this code should be changed to avoid unnecessary 
conversions from longs to bytes to strings. I thought about changing this in 
my earlier patch but did not think it was causing issues, hence kept it the way 
it was. 

{code}
byte[] compoundColumnQualifierBytes =
    Separator.VALUES.join(columnQualifierWithTsBytes,
        Bytes.toBytes(info.getKey()));
// convert back to string to avoid additional API on store.
String compoundColumnQualifier =
    Bytes.toString(compoundColumnQualifierBytes);
EntityColumnPrefix.EVENT.store(rowKey, entityTable,
    compoundColumnQualifier, null, info.getValue());
{code}


Also, to [~gtCarrera]'s point, we should provide an additional API for 
getColumnQualifier that accepts a pre-encoded byte array. It would be in 
addition to the existing API that accepts a String, so that we can use either 
one as applicable. What do you think, [~gtCarrera]?




[jira] [Commented] (YARN-3049) [Storage Implementation] Implement storage reader interface to fetch raw data from HBase backend

2015-08-06 Thread Li Lu (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-3049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14660902#comment-14660902
 ] 

Li Lu commented on YARN-3049:
-

A little more investigation shows that we're using Strings as the column 
qualifier type in our HBase interfaces. They are then encoded into byte arrays 
in the getColumnQualifier() helper function. Given that we may want to add 
timestamps to column qualifiers, we have at least the following two solutions:
# Have a getColumnQualifier() helper function that works on pre-encoded byte 
arrays? 
# Change the interface of getColumnQualifier() to take byte arrays?

Maybe we have some better options, but so far I'm leaning towards the first 
way, although this makes parsing one column family more tricky. 
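
As a rough sketch of the first option, the helper could gain a byte[] overload next to the String one. Everything below is hypothetical: the class name, the method signatures, and the "!" separator are assumptions made for illustration and are not the actual timeline service helper code.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch; names and separator are assumptions, not the real API.
class ColumnQualifierSketch {
  // Assumed separator between the column prefix and the qualifier.
  static final byte[] SEPARATOR = "!".getBytes(StandardCharsets.UTF_8);

  // String variant: encodes the qualifier, then delegates to the byte[] one.
  static byte[] getColumnQualifier(byte[] prefix, String qualifier) {
    return getColumnQualifier(prefix,
        qualifier.getBytes(StandardCharsets.UTF_8));
  }

  // byte[] variant: the qualifier is already encoded and is copied untouched,
  // so an embedded 8-byte timestamp survives as is.
  static byte[] getColumnQualifier(byte[] prefix, byte[] qualifier) {
    return ByteBuffer
        .allocate(prefix.length + SEPARATOR.length + qualifier.length)
        .put(prefix)
        .put(SEPARATOR)
        .put(qualifier)
        .array();
  }

  public static void main(String[] args) {
    byte[] prefix = "e".getBytes(StandardCharsets.UTF_8);
    byte[] ts = ByteBuffer.allocate(Long.BYTES).putLong(1234567890L).array();
    byte[] qualifier = getColumnQualifier(prefix, ts);
    // Read the trailing 8 bytes back as a long.
    long decoded = ByteBuffer
        .wrap(qualifier, qualifier.length - Long.BYTES, Long.BYTES)
        .getLong();
    System.out.println(decoded == 1234567890L);
  }
}
```

Keeping both overloads means existing String callers stay unchanged, while callers that already hold encoded bytes can skip the String conversion.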

Meanwhile, I think the problem is beyond the scope of this JIRA (it's more like 
a whole stack fix rather than the reader itself). Therefore I propose to 
address the problem in a separate JIRA and move forward with the current patch. 

Any comments [~sjlee0] [~jrottinghuis] [~vrushalic]? Thanks! 


[jira] [Commented] (YARN-3049) [Storage Implementation] Implement storage reader interface to fetch raw data from HBase backend

2015-08-06 Thread Li Lu (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-3049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14660877#comment-14660877
 ] 

Li Lu commented on YARN-3049:
-

Hi [~vrushalic], I think the conversion to string happens on the write code 
path, in YARN-3984, as:
{code}
+  byte[] compoundColumnQualifierBytes =
+      Separator.VALUES.join(columnQualifierWithTsBytes,
+          null);
+  String compoundColumnQualifier =
+      Bytes.toString(compoundColumnQualifierBytes);
+  EntityColumnPrefix.EVENT.store(rowKey, entityTable,
+      compoundColumnQualifier, null, TimelineWriterUtils.EMPTY_BYTES);
{code}

Are we sure {{compoundColumnQualifier}} is fine with the attached long values? 
Thanks! 


[jira] [Commented] (YARN-3049) [Storage Implementation] Implement storage reader interface to fetch raw data from HBase backend

2015-08-06 Thread Vrushali C (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-3049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14660867#comment-14660867
 ] 

Vrushali C commented on YARN-3049:
--

Hi [~zjshen]
In my experience, that kind of conversion from Long to Bytes to String to 
Bytes to Long does not work. When an object is serialized with 
Bytes.toBytes(Long), we cannot read it back with Bytes.toString(). It has to be 
read back with Bytes.toLong(). 
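
A small sketch of why the round trip fails, written so it runs without HBase. It assumes Bytes.toBytes(long) is the 8-byte big-endian form and that Bytes.toString() and Bytes.toBytes(String) decode and encode UTF-8. The helpers encodeLong, decodeLong, and viaString are hypothetical stand-ins for those calls.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Illustrative stand-ins for the HBase Bytes helpers (assumed behavior).
class LongRoundTrip {
  static byte[] encodeLong(long v) {
    return ByteBuffer.allocate(Long.BYTES).putLong(v).array();
  }

  static long decodeLong(byte[] b) {
    return ByteBuffer.wrap(b).getLong();
  }

  // bytes -> String -> bytes, assuming UTF-8 on both legs.
  static byte[] viaString(byte[] raw) {
    String s = new String(raw, StandardCharsets.UTF_8);
    return s.getBytes(StandardCharsets.UTF_8);
  }

  public static void main(String[] args) {
    long value = 1234567890L;
    byte[] raw = encodeLong(value);
    // Direct path: the long is recovered exactly.
    System.out.println(decodeLong(raw) == value);
    // String detour: bytes that are not valid UTF-8 get replaced during
    // decoding, so the re-encoded array no longer matches the original.
    System.out.println(Arrays.equals(raw, viaString(raw)));
  }
}
```

For this value the raw bytes include 0x96 and 0xD2, which do not form valid UTF-8 sequences here, so the String detour cannot reproduce the original array.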

Is there any reason you need to use String to carry values across? Could you 
use byte[] instead and then convert them back as appropriate? 
thanks
Vrushali



[jira] [Commented] (YARN-3049) [Storage Implementation] Implement storage reader interface to fetch raw data from HBase backend

2015-08-06 Thread Zhijie Shen (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-3049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14660853#comment-14660853
 ] 

Zhijie Shen commented on YARN-3049:
---

Here's a quick example:
{code}
  @Test
  public void test() {
    // imitate the process to write a long
    Long a = 1234567890L;
    byte[] b = Bytes.toBytes(a);
    String c = Bytes.toString(b);
    // imitate the process to read a long
    byte[] d = Bytes.toBytes(c);
    Long e = Bytes.toLong(d);
    assertEquals(a, e);
  }
{code}
b and d end up as different byte arrays. Am I using Bytes the wrong way?


[jira] [Commented] (YARN-3049) [Storage Implementation] Implement storage reader interface to fetch raw data from HBase backend

2015-08-06 Thread Sangjin Lee (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-3049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14660444#comment-14660444
 ] 

Sangjin Lee commented on YARN-3049:
---

The latest patch (v.7) looks good to me.

Which timestamp are you seeing the issue with? Or is it with any timestamp?


[jira] [Commented] (YARN-3049) [Storage Implementation] Implement storage reader interface to fetch raw data from HBase backend

2015-08-05 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-3049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14659413#comment-14659413
 ] 

Hadoop QA commented on YARN-3049:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | pre-patch |  17m 11s | Findbugs (version ) appears to be broken on YARN-2928. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any @author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to include 1 new or modified test files. |
| {color:green}+1{color} | javac |   7m 48s | There were no new javac warning messages. |
| {color:green}+1{color} | javadoc |   9m 49s | There were no new javadoc warning messages. |
| {color:green}+1{color} | release audit |   0m 24s | The applied patch does not increase the total number of release audit warnings. |
| {color:green}+1{color} | checkstyle |   1m 17s | There were no new checkstyle issues. |
| {color:green}+1{color} | whitespace |   0m 11s | The patch has no lines that end in whitespace. |
| {color:green}+1{color} | install |   1m 29s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 40s | The patch built with eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   2m 20s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | yarn tests |   0m 24s | Tests passed in hadoop-yarn-api. |
| {color:green}+1{color} | yarn tests |   1m 24s | Tests passed in hadoop-yarn-server-timelineservice. |
| | |  43m  2s | |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | http://issues.apache.org/jira/secure/attachment/12748985/YARN-3049-YARN-2928.7.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | YARN-2928 / 895ccfa |
| hadoop-yarn-api test log | https://builds.apache.org/job/PreCommit-YARN-Build/8779/artifact/patchprocess/testrun_hadoop-yarn-api.txt |
| hadoop-yarn-server-timelineservice test log | https://builds.apache.org/job/PreCommit-YARN-Build/8779/artifact/patchprocess/testrun_hadoop-yarn-server-timelineservice.txt |
| Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/8779/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf906.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | https://builds.apache.org/job/PreCommit-YARN-Build/8779/console |


This message was automatically generated.


[jira] [Commented] (YARN-3049) [Storage Implementation] Implement storage reader interface to fetch raw data from HBase backend

2015-08-05 Thread Sangjin Lee (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-3049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14659256#comment-14659256
 ] 

Sangjin Lee commented on YARN-3049:
---

The latest patch looks good to me overall. Just a couple of comments.

I concur with [~gtCarrera9] that it might be a good idea to create more 
abstract methods around it. Note that we may be writing to other tables at this 
point too. We can even create private helper methods that check whether the 
entity is an application and so on. It's not critical but could be helpful...

Also, in {{HBaseTimelineWriterImpl}}, I see that the app-to-flow table is not 
being flushed. Either we should flush at the end of the write, or add it to the 
{{flush()}} method.
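
The second option might look roughly like the sketch below. It is not the actual HBaseTimelineWriterImpl code: the CountingBuffer type and the field names are hypothetical stand-ins for whatever buffered writer each table uses, kept in plain Java so the idea is runnable on its own.

```java
import java.io.Flushable;

// Hypothetical sketch: CountingBuffer stands in for a per-table write buffer.
class MultiTableFlushSketch {
  static final class CountingBuffer implements Flushable {
    int pending;
    int flushed;

    void write() {
      pending++;
    }

    @Override
    public void flush() {
      flushed += pending;
      pending = 0;
    }
  }

  final CountingBuffer entityTable = new CountingBuffer();
  final CountingBuffer appToFlowTable = new CountingBuffer();

  // Application entities also produce a write to the app-to-flow table.
  void write(boolean isApplication) {
    entityTable.write();
    if (isApplication) {
      appToFlowTable.write();
    }
  }

  // flush() covers every table the writer touches, not only the entity table.
  void flush() {
    entityTable.flush();
    appToFlowTable.flush();
  }

  public static void main(String[] args) {
    MultiTableFlushSketch writer = new MultiTableFlushSketch();
    writer.write(true);
    writer.flush();
    System.out.println(writer.appToFlowTable.pending == 0);
  }
}
```

Putting the call in flush() rather than at the end of each write keeps the write path buffered, at the cost of relying on callers to flush.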

> [Storage Implementation] Implement storage reader interface to fetch raw data 
> from HBase backend
> 
>
> Key: YARN-3049
> URL: https://issues.apache.org/jira/browse/YARN-3049
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Reporter: Sangjin Lee
>Assignee: Zhijie Shen
> Attachments: YARN-3049-WIP.1.patch, YARN-3049-WIP.2.patch, 
> YARN-3049-WIP.3.patch, YARN-3049-YARN-2928.2.patch, 
> YARN-3049-YARN-2928.3.patch, YARN-3049-YARN-2928.4.patch, 
> YARN-3049-YARN-2928.5.patch, YARN-3049-YARN-2928.6.patch
>
>
> Implement existing ATS queries with the new ATS reader design.




[jira] [Commented] (YARN-3049) [Storage Implementation] Implement storage reader interface to fetch raw data from HBase backend

2015-08-05 Thread Li Lu (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-3049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14659012#comment-14659012
 ] 

Li Lu commented on YARN-3049:
-

Hi [~zjshen], handling this locally in the HBase implementation looks good to 
me. One minor comment on the latest patch: maybe we want to move logic like {{if 
(te.getType().equals(TimelineEntityType.YARN_APPLICATION.toString()))}} in 
HBaseWriterImpl into a separate private method? I think it will be much clearer 
to say something like:
{code}
if (te.getType().equals(TimelineEntityType.YARN_APPLICATION.toString())) {
  updateAppToFlowTable(te);
}
{code}

As [~sjlee0] mentioned above, we may have some other specializations within 
HBaseWriterImpl, so maybe it's helpful to let these special designs stand out? 
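
A slightly fuller sketch of that shape, under stated assumptions: {{SketchEntityType}}, {{SketchEntity}}, and {{HelperSketchWriter}} are hypothetical stand-ins written for this illustration, and plain lists take the place of the real tables. It only shows how the type check and the application-specific write can each sit in their own private method.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the entity type enum referenced in the thread.
enum SketchEntityType { YARN_APPLICATION, YARN_CONTAINER }

// Hypothetical minimal entity: only the fields this sketch needs.
class SketchEntity {
  private final String id;
  private final String type;

  SketchEntity(String id, String type) {
    this.id = id;
    this.type = type;
  }

  String getId() { return id; }
  String getType() { return type; }
}

// Hypothetical writer; the lists stand in for the entity and app-to-flow tables.
class HelperSketchWriter {
  final List<String> entityRows = new ArrayList<>();
  final List<String> appToFlowRows = new ArrayList<>();

  void write(SketchEntity te) {
    entityRows.add(te.getId());
    if (isApplication(te)) {
      updateAppToFlowTable(te);
    }
  }

  // The type check lives in one place, so other app-specific paths can reuse it.
  private boolean isApplication(SketchEntity te) {
    return SketchEntityType.YARN_APPLICATION.toString().equals(te.getType());
  }

  // Application-only side effect, kept visibly separate from the generic write.
  private void updateAppToFlowTable(SketchEntity te) {
    appToFlowRows.add(te.getId());
  }
}
```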

> [Storage Implementation] Implement storage reader interface to fetch raw data 
> from HBase backend
> 
>
> Key: YARN-3049
> URL: https://issues.apache.org/jira/browse/YARN-3049
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Reporter: Sangjin Lee
>Assignee: Zhijie Shen
> Attachments: YARN-3049-WIP.1.patch, YARN-3049-WIP.2.patch, 
> YARN-3049-WIP.3.patch, YARN-3049-YARN-2928.2.patch, 
> YARN-3049-YARN-2928.3.patch, YARN-3049-YARN-2928.4.patch, 
> YARN-3049-YARN-2928.5.patch, YARN-3049-YARN-2928.6.patch
>
>
> Implement existing ATS queries with the new ATS reader design.




[jira] [Commented] (YARN-3049) [Storage Implementation] Implement storage reader interface to fetch raw data from HBase backend

2015-08-04 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-3049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14654737#comment-14654737
 ] 

Hadoop QA commented on YARN-3049:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | pre-patch |  17m 34s | Findbugs (version ) appears to be broken on YARN-2928. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any @author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to include 1 new or modified test files. |
| {color:green}+1{color} | javac |   8m  0s | There were no new javac warning messages. |
| {color:green}+1{color} | javadoc |   9m 58s | There were no new javadoc warning messages. |
| {color:green}+1{color} | release audit |   0m 22s | The applied patch does not increase the total number of release audit warnings. |
| {color:green}+1{color} | checkstyle |   1m 19s | There were no new checkstyle issues. |
| {color:green}+1{color} | whitespace |   0m  9s | The patch has no lines that end in whitespace. |
| {color:green}+1{color} | install |   1m 26s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 40s | The patch built with eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   2m 22s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | yarn tests |   0m 23s | Tests passed in hadoop-yarn-api. |
| {color:green}+1{color} | yarn tests |   1m 24s | Tests passed in hadoop-yarn-server-timelineservice. |
| | |  43m 44s | |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12748758/YARN-3049-YARN-2928.6.patch
 |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | YARN-2928 / bf65663 |
| hadoop-yarn-api test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/8767/artifact/patchprocess/testrun_hadoop-yarn-api.txt
 |
| hadoop-yarn-server-timelineservice test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/8767/artifact/patchprocess/testrun_hadoop-yarn-server-timelineservice.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/8767/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf903.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/8767/console |


This message was automatically generated.

> [Storage Implementation] Implement storage reader interface to fetch raw data 
> from HBase backend
> 
>
> Key: YARN-3049
> URL: https://issues.apache.org/jira/browse/YARN-3049
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Reporter: Sangjin Lee
>Assignee: Zhijie Shen
> Attachments: YARN-3049-WIP.1.patch, YARN-3049-WIP.2.patch, 
> YARN-3049-WIP.3.patch, YARN-3049-YARN-2928.2.patch, 
> YARN-3049-YARN-2928.3.patch, YARN-3049-YARN-2928.4.patch, 
> YARN-3049-YARN-2928.5.patch, YARN-3049-YARN-2928.6.patch
>
>
> Implement existing ATS queries with the new ATS reader design.




[jira] [Commented] (YARN-3049) [Storage Implementation] Implement storage reader interface to fetch raw data from HBase backend

2015-08-04 Thread Sangjin Lee (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-3049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14654341#comment-14654341
 ] 

Sangjin Lee commented on YARN-3049:
---

{quote}
I'm trying to understand the discussion here. Yes, and what we worked quite 
hard to avoid is to identify the types of the incoming entities, in a writer, 
so that we can apply different write code paths. If this is the case, maybe we 
can refactor the write method so that it contains an expandable context object? 
We can easily encapsulate flags in a BitSet-like object, and we may add more if 
needed. The only problem I'm wondering about is, is it possible for the caller 
to easily generate a context with all required information (such as isNewApp or 
appFinish)?
BTW, I believe we need to refactor the interface of the read and write methods 
to use some sort of context anyway. Our current argument lists are not 
expandable. So if this helps, maybe we can move forward by refactoring the write 
interfaces?
{quote}

Another place where {{HBaseTimelineWriterImpl}} would check for the entity type 
(being the application) is splitting the application table (YARN-3906). The 
current patch checks the type of the entity to be able to send writes to 
different tables. So that would need to be included in the discussion as well.

I completely understand the desire to make writers as agnostic as possible 
about entity types and data. However, since a lot of things in the schema need 
to be based on the applications (flow context, the application table, flow run 
aggregation, etc.), the need to support that strongly is real. We can either go 
the route of having the writer recognize applications and some of their events 
explicitly (at the expense of making the separation between entities and 
writers a little weaker), or try to create a context for this decision (as 
[~gtCarrera9] suggested) and have the writer act on it.

As for the latter option, while it still shields the writer from knowing 
details about entities, it would still need to know similar attributes (e.g. 
"application created", "whether the entity is an application", etc.), only in a 
more passive manner.

Thoughts?

> [Storage Implementation] Implement storage reader interface to fetch raw data 
> from HBase backend
> 
>
> Key: YARN-3049
> URL: https://issues.apache.org/jira/browse/YARN-3049
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Reporter: Sangjin Lee
>Assignee: Zhijie Shen
> Attachments: YARN-3049-WIP.1.patch, YARN-3049-WIP.2.patch, 
> YARN-3049-WIP.3.patch, YARN-3049-YARN-2928.2.patch, 
> YARN-3049-YARN-2928.3.patch, YARN-3049-YARN-2928.4.patch, 
> YARN-3049-YARN-2928.5.patch
>
>
> Implement existing ATS queries with the new ATS reader design.




[jira] [Commented] (YARN-3049) [Storage Implementation] Implement storage reader interface to fetch raw data from HBase backend

2015-08-03 Thread Sangjin Lee (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-3049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14652854#comment-14652854
 ] 

Sangjin Lee commented on YARN-3049:
---

{quote}
Then, we uniformly process the entities no matter what their type is. What we 
discussed so far implies that we cannot only treat the entities so generally. 
For application entity, we may need to take an additional step to parse its 
start/finish event to write more records.
{quote}

I understand that we want to do that as much as possible. However, we made 
several schema decisions that call out apps pretty explicitly, and implementing 
them requires some amount of special treatment of the application entities. For 
example, the app-to-flow table is already a special table for applications. 
Similarly, real-time aggregation takes values from application entities to the 
flow run level. I don't think it's as bad as it might sound.

> [Storage Implementation] Implement storage reader interface to fetch raw data 
> from HBase backend
> 
>
> Key: YARN-3049
> URL: https://issues.apache.org/jira/browse/YARN-3049
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Reporter: Sangjin Lee
>Assignee: Zhijie Shen
> Attachments: YARN-3049-WIP.1.patch, YARN-3049-WIP.2.patch, 
> YARN-3049-WIP.3.patch, YARN-3049-YARN-2928.2.patch, 
> YARN-3049-YARN-2928.3.patch, YARN-3049-YARN-2928.4.patch, 
> YARN-3049-YARN-2928.5.patch
>
>
> Implement existing ATS queries with the new ATS reader design.




[jira] [Commented] (YARN-3049) [Storage Implementation] Implement storage reader interface to fetch raw data from HBase backend

2015-08-03 Thread Li Lu (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-3049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14652850#comment-14652850
 ] 

Li Lu commented on YARN-3049:
-

bq. What we discussed so far implies that we cannot only treat the entities so 
generally. For application entity, we may need to take an additional step to 
parse its start/finish event to write more records.

I'm trying to understand the discussion here. Yes, and what we worked quite 
hard to avoid is to identify the types of the incoming entities, in a writer, 
so that we can apply different write code paths. If this is the case, maybe we 
can refactor the write method so that it contains an expandable context object? 
We can easily encapsulate flags in a BitSet-like object, and we may add more if 
needed. The only problem I'm wondering about is, is it possible for the caller 
to easily generate a context with all required information (such as isNewApp or 
appFinish)? 

BTW, I believe we need to refactor the interface of the read and write methods 
to use some sort of context anyway. Our current argument lists are not 
expandable. So if this helps, maybe we can move forward by refactoring the write 
interfaces? 
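
To make the context idea above concrete, here is a rough sketch under stated assumptions. {{WriteFlag}}, {{WriteContext}}, and {{ContextSketchWriter}} are hypothetical names, the table names are placeholders, and an {{EnumSet}} plays the role of the "BitSet-like" flag holder. The writer acts on flags supplied by the caller instead of inspecting entity types itself.

```java
import java.util.ArrayList;
import java.util.EnumSet;
import java.util.List;

// Hypothetical flags; more can be added without changing method signatures.
enum WriteFlag { NEW_APP, APP_FINISHED }

// Hypothetical expandable context passed alongside the entities.
class WriteContext {
  private final EnumSet<WriteFlag> flags = EnumSet.noneOf(WriteFlag.class);

  // Fluent setter so callers can chain several flags.
  WriteContext with(WriteFlag flag) {
    flags.add(flag);
    return this;
  }

  boolean has(WriteFlag flag) {
    return flags.contains(flag);
  }
}

// Hypothetical writer that branches on the context, not on the entity type.
class ContextSketchWriter {
  // Returns the (placeholder) names of the tables this write would touch.
  List<String> write(WriteContext ctx, String entityId) {
    List<String> tables = new ArrayList<>();
    tables.add("entity");
    if (ctx.has(WriteFlag.NEW_APP)) {
      tables.add("app_to_flow");
    }
    if (ctx.has(WriteFlag.APP_FINISHED)) {
      tables.add("flow_run");
    }
    return tables;
  }
}
```

The open question raised above still applies to this sketch: the caller has to know when to set {{NEW_APP}} or {{APP_FINISHED}}, so the knowledge moves to the caller rather than disappearing.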

> [Storage Implementation] Implement storage reader interface to fetch raw data 
> from HBase backend
> 
>
> Key: YARN-3049
> URL: https://issues.apache.org/jira/browse/YARN-3049
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Reporter: Sangjin Lee
>Assignee: Zhijie Shen
> Attachments: YARN-3049-WIP.1.patch, YARN-3049-WIP.2.patch, 
> YARN-3049-WIP.3.patch, YARN-3049-YARN-2928.2.patch, 
> YARN-3049-YARN-2928.3.patch, YARN-3049-YARN-2928.4.patch, 
> YARN-3049-YARN-2928.5.patch
>
>
> Implement existing ATS queries with the new ATS reader design.




[jira] [Commented] (YARN-3049) [Storage Implementation] Implement storage reader interface to fetch raw data from HBase backend

2015-08-03 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-3049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14652823#comment-14652823
 ] 

Hadoop QA commented on YARN-3049:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | pre-patch |  21m 18s | Findbugs (version ) appears to be broken on YARN-2928. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any @author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to include 3 new or modified test files. |
| {color:green}+1{color} | javac |  10m 45s | There were no new javac warning messages. |
| {color:green}+1{color} | javadoc |  11m 23s | There were no new javadoc warning messages. |
| {color:green}+1{color} | release audit |   0m 25s | The applied patch does not increase the total number of release audit warnings. |
| {color:green}+1{color} | checkstyle |   1m 51s | There were no new checkstyle issues. |
| {color:green}+1{color} | whitespace |   0m 14s | The patch has no lines that end in whitespace. |
| {color:green}+1{color} | install |   1m 31s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 41s | The patch built with eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   3m 51s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | yarn tests |   0m 24s | Tests passed in hadoop-yarn-api. |
| {color:red}-1{color} | yarn tests |  53m  3s | Tests failed in hadoop-yarn-server-resourcemanager. |
| {color:red}-1{color} | yarn tests |   0m 24s | Tests failed in hadoop-yarn-server-timelineservice. |
| | | 105m 56s | |
\\
\\
|| Reason || Tests ||
| Failed unit tests | hadoop.yarn.server.resourcemanager.scheduler.capacity.TestApplicationPriority |
| Failed build | hadoop-yarn-server-timelineservice |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12748542/YARN-3049-YARN-2928.5.patch
 |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | YARN-2928 / df0ec47 |
| hadoop-yarn-api test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/8754/artifact/patchprocess/testrun_hadoop-yarn-api.txt
 |
| hadoop-yarn-server-resourcemanager test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/8754/artifact/patchprocess/testrun_hadoop-yarn-server-resourcemanager.txt
 |
| hadoop-yarn-server-timelineservice test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/8754/artifact/patchprocess/testrun_hadoop-yarn-server-timelineservice.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/8754/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf903.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/8754/console |


This message was automatically generated.

> [Storage Implementation] Implement storage reader interface to fetch raw data 
> from HBase backend
> 
>
> Key: YARN-3049
> URL: https://issues.apache.org/jira/browse/YARN-3049
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Reporter: Sangjin Lee
>Assignee: Zhijie Shen
> Attachments: YARN-3049-WIP.1.patch, YARN-3049-WIP.2.patch, 
> YARN-3049-WIP.3.patch, YARN-3049-YARN-2928.2.patch, 
> YARN-3049-YARN-2928.3.patch, YARN-3049-YARN-2928.4.patch, 
> YARN-3049-YARN-2928.5.patch
>
>
> Implement existing ATS queries with the new ATS reader design.




[jira] [Commented] (YARN-3049) [Storage Implementation] Implement storage reader interface to fetch raw data from HBase backend

2015-08-03 Thread Zhijie Shen (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-3049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14652806#comment-14652806
 ] 

Zhijie Shen commented on YARN-3049:
---

Okay, what will the timestamp be used for? If too much context info is 
required, I agree it's not elegant to incrementally expose it to the backend.

Taking a step back, I am starting to understand that the real situation deviates 
from what I originally had in mind for the storage layer. When defining the data 
model, I defined a generic TimelineEntity and made the other first-class 
entities extend it. Then, we uniformly process the entities no matter what 
their type is. What we have discussed so far implies that we cannot treat the 
entities only in this generic way. For an application entity, we may need to 
take an additional step to parse its start/finish event to write more records.

> [Storage Implementation] Implement storage reader interface to fetch raw data 
> from HBase backend
> 
>
> Key: YARN-3049
> URL: https://issues.apache.org/jira/browse/YARN-3049
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Reporter: Sangjin Lee
>Assignee: Zhijie Shen
> Attachments: YARN-3049-WIP.1.patch, YARN-3049-WIP.2.patch, 
> YARN-3049-WIP.3.patch, YARN-3049-YARN-2928.2.patch, 
> YARN-3049-YARN-2928.3.patch, YARN-3049-YARN-2928.4.patch, 
> YARN-3049-YARN-2928.5.patch
>
>
> Implement existing ATS queries with the new ATS reader design.




[jira] [Commented] (YARN-3049) [Storage Implementation] Implement storage reader interface to fetch raw data from HBase backend

2015-08-03 Thread Sangjin Lee (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-3049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14652709#comment-14652709
 ] 

Sangjin Lee commented on YARN-3049:
---

I like that approach better than the previous. Thanks for the update.

How would we be able to handle the "app finished" event? That needs to be 
supported too for other tables, and adding another flag to the context doesn't 
seem too appealing. Also, the timestamps of these events are important as they 
need to be written to some secondary tables. How can we capture them? If 
{{HBaseTimelineWriterImpl}} needs to recognize and read the event timestamp, 
then we might as well just look for those events, right? Any thoughts on these?

> [Storage Implementation] Implement storage reader interface to fetch raw data 
> from HBase backend
> 
>
> Key: YARN-3049
> URL: https://issues.apache.org/jira/browse/YARN-3049
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Reporter: Sangjin Lee
>Assignee: Zhijie Shen
> Attachments: YARN-3049-WIP.1.patch, YARN-3049-WIP.2.patch, 
> YARN-3049-WIP.3.patch, YARN-3049-YARN-2928.2.patch, 
> YARN-3049-YARN-2928.3.patch, YARN-3049-YARN-2928.4.patch, 
> YARN-3049-YARN-2928.5.patch
>
>
> Implement existing ATS queries with the new ATS reader design.




[jira] [Commented] (YARN-3049) [Storage Implementation] Implement storage reader interface to fetch raw data from HBase backend

2015-08-03 Thread Zhijie Shen (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-3049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14652206#comment-14652206
 ] 

Zhijie Shen commented on YARN-3049:
---

Hi Sangjin,


Thanks for your comments. The proposed method will work for now and can 
minimize the changes we need to make. In fact, I considered this method too. 
The reason I abandoned it is that it couples the business logic and the data 
storage. It potentially increases the risk that a change in the business logic 
will break the storage layer. For example, suppose we rename app_created to 
app_started. This may still be easy to fix, but the maintenance difficulty is 
likely to increase as the logic grows more complex. That's why I think we should 
let the app collector tell the backend that it's the first request.


On the other hand, I agree the RM should be responsible for this too. Actually, 
this is also what I did in the current patch. If you agree with my proposal of 
letting the app collector determine whether it is the first request, what we 
can do is extend the RM app collector and implement this logic there.


Thanks,

Zhijie



> [Storage Implementation] Implement storage reader interface to fetch raw data 
> from HBase backend
> 
>
> Key: YARN-3049
> URL: https://issues.apache.org/jira/browse/YARN-3049
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Reporter: Sangjin Lee
>Assignee: Zhijie Shen
> Attachments: YARN-3049-WIP.1.patch, YARN-3049-WIP.2.patch, 
> YARN-3049-WIP.3.patch, YARN-3049-YARN-2928.2.patch, 
> YARN-3049-YARN-2928.3.patch, YARN-3049-YARN-2928.4.patch
>
>
> Implement existing ATS queries with the new ATS reader design.




[jira] [Commented] (YARN-3049) [Storage Implementation] Implement storage reader interface to fetch raw data from HBase backend

2015-08-03 Thread Sangjin Lee (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-3049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14652140#comment-14652140
 ] 

Sangjin Lee commented on YARN-3049:
---

When {{HBaseTimelineWriterImpl}} processes events for writes, it could have a 
rule for those couple of special events (identified by the entity type = "yarn 
application", event type = "application created" or "application finished"), 
and trigger the extra writes on them, right? I understand that it is a bit 
unnatural for {{HBaseTimelineWriterImpl}} to recognize those events explicitly, 
but that could make this self-contained, right?
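
A minimal sketch of such a rule, with everything in it assumed for illustration: {{SketchEvent}}, {{EventEntity}}, and {{EventRuleWriter}} are hypothetical classes, and the type and event identifier strings are placeholders for whatever constants the real code uses. The writer scans the events of application entities and records the timestamps it finds.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical event: an id plus the time it occurred.
class SketchEvent {
  final String id;
  final long timestamp;

  SketchEvent(String id, long timestamp) {
    this.id = id;
    this.timestamp = timestamp;
  }
}

// Hypothetical entity carrying a type and its events.
class EventEntity {
  final String type;
  final List<SketchEvent> events = new ArrayList<>();

  EventEntity(String type) {
    this.type = type;
  }

  EventEntity addEvent(String id, long timestamp) {
    events.add(new SketchEvent(id, timestamp));
    return this;
  }
}

// Hypothetical writer that recognizes the two special application events.
class EventRuleWriter {
  // Assumed identifiers; the real type and event names may differ.
  static final String APP_TYPE = "YARN_APPLICATION";
  static final String CREATED = "APPLICATION_CREATED";
  static final String FINISHED = "APPLICATION_FINISHED";

  Long appStartTime;  // null until a created event is seen
  Long appEndTime;    // null until a finished event is seen

  void write(EventEntity entity) {
    // The generic entity write would happen here for every entity.
    if (!APP_TYPE.equals(entity.type)) {
      return;
    }
    for (SketchEvent e : entity.events) {
      if (CREATED.equals(e.id)) {
        appStartTime = e.timestamp;
      } else if (FINISHED.equals(e.id)) {
        appEndTime = e.timestamp;
      }
    }
  }
}
```

Because the timestamps come from the events themselves, no extra flag or argument is needed to carry them to the writer.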

I think this is a rather important point because there are more tables that 
need to be written to on application creation and completion, and also more 
data than the flow context. For example, the schema proposal calls for writing 
the application start time and the application end time upon receiving those 
events, among others. We want to have a single point where all these are done. 
See 
https://issues.apache.org/jira/secure/attachment/12743391/hbase-schema-proposal-for-aggregation.pdf
 for more details.

I also think that doing it via the RM timeline collector is probably the best 
for this. The RM timeline collector is the one that's writing these events to 
begin with, and it can do that without worrying about the *app* timeline 
collector starting up in time, etc. Thoughts?

> [Storage Implementation] Implement storage reader interface to fetch raw data 
> from HBase backend
> 
>
> Key: YARN-3049
> URL: https://issues.apache.org/jira/browse/YARN-3049
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Reporter: Sangjin Lee
>Assignee: Zhijie Shen
> Attachments: YARN-3049-WIP.1.patch, YARN-3049-WIP.2.patch, 
> YARN-3049-WIP.3.patch, YARN-3049-YARN-2928.2.patch, 
> YARN-3049-YARN-2928.3.patch, YARN-3049-YARN-2928.4.patch
>
>
> Implement existing ATS queries with the new ATS reader design.




[jira] [Commented] (YARN-3049) [Storage Implementation] Implement storage reader interface to fetch raw data from HBase backend

2015-07-31 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-3049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14650114#comment-14650114
 ] 

Hadoop QA commented on YARN-3049:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | pre-patch |  21m 48s | Findbugs (version ) appears to be broken on YARN-2928. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any @author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to include 3 new or modified test files. |
| {color:green}+1{color} | javac |  11m 50s | There were no new javac warning messages. |
| {color:green}+1{color} | javadoc |  12m 25s | There were no new javadoc warning messages. |
| {color:green}+1{color} | release audit |   0m 24s | The applied patch does not increase the total number of release audit warnings. |
| {color:green}+1{color} | checkstyle |   2m  5s | There were no new checkstyle issues. |
| {color:green}+1{color} | whitespace |   0m 16s | The patch has no lines that end in whitespace. |
| {color:green}+1{color} | install |   1m 48s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 52s | The patch built with eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   4m 29s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | yarn tests |   0m 26s | Tests passed in hadoop-yarn-api. |
| {color:red}-1{color} | yarn tests |  53m 17s | Tests failed in hadoop-yarn-server-resourcemanager. |
| {color:green}+1{color} | yarn tests |   1m 26s | Tests passed in hadoop-yarn-server-timelineservice. |
| | | 111m 15s | |
\\
\\
|| Reason || Tests ||
| Failed unit tests | hadoop.yarn.server.resourcemanager.rmapp.TestRMAppTransitions |
| Timed out tests | org.apache.hadoop.yarn.server.resourcemanager.TestWorkPreservingRMRestart |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12748249/YARN-3049-YARN-2928.4.patch
 |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | YARN-2928 / df0ec47 |
| hadoop-yarn-api test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/8741/artifact/patchprocess/testrun_hadoop-yarn-api.txt
 |
| hadoop-yarn-server-resourcemanager test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/8741/artifact/patchprocess/testrun_hadoop-yarn-server-resourcemanager.txt
 |
| hadoop-yarn-server-timelineservice test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/8741/artifact/patchprocess/testrun_hadoop-yarn-server-timelineservice.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/8741/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf904.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/8741/console |


This message was automatically generated.

> [Storage Implementation] Implement storage reader interface to fetch raw data 
> from HBase backend
> 
>
> Key: YARN-3049
> URL: https://issues.apache.org/jira/browse/YARN-3049
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Reporter: Sangjin Lee
>Assignee: Zhijie Shen
> Attachments: YARN-3049-WIP.1.patch, YARN-3049-WIP.2.patch, 
> YARN-3049-WIP.3.patch, YARN-3049-YARN-2928.2.patch, 
> YARN-3049-YARN-2928.3.patch, YARN-3049-YARN-2928.4.patch
>
>
> Implement existing ATS queries with the new ATS reader design.




[jira] [Commented] (YARN-3049) [Storage Implementation] Implement storage reader interface to fetch raw data from HBase backend

2015-07-31 Thread Zhijie Shen (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-3049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14650106#comment-14650106
 ] 

Zhijie Shen commented on YARN-3049:
---

What I meant before is that HBaseTimelineWriterImpl is not aware of the life 
cycle/session of the application, so it's hard to detect the app creation event 
inside HBaseTimelineWriterImpl and make it transparent to the caller. Instead, 
the app collector can know whether it is the first put request for this app 
sent to the writer.

> [Storage Implementation] Implement storage reader interface to fetch raw data 
> from HBase backend
> 
>
> Key: YARN-3049
> URL: https://issues.apache.org/jira/browse/YARN-3049
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Reporter: Sangjin Lee
>Assignee: Zhijie Shen
> Attachments: YARN-3049-WIP.1.patch, YARN-3049-WIP.2.patch, 
> YARN-3049-WIP.3.patch, YARN-3049-YARN-2928.2.patch, 
> YARN-3049-YARN-2928.3.patch, YARN-3049-YARN-2928.4.patch
>
>
> Implement existing ATS queries with the new ATS reader design.




[jira] [Commented] (YARN-3049) [Storage Implementation] Implement storage reader interface to fetch raw data from HBase backend

2015-07-31 Thread Sangjin Lee (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-3049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14650075#comment-14650075
 ] 

Sangjin Lee commented on YARN-3049:
---

I thought that the application created event would be written by the RM (and 
its embedded collector), no? So I'm not sure if the writer (for the app 
timeline collector) being bound to the session of the application is an issue. 
Maybe I misunderstood your comment?

In essence, the flow of control that I was thinking of is not really different 
than your v.3 patch. My point was more about the way we're passing that 
information. I think it should be possible from inside 
{{HBaseTimelineWriterImpl}} to detect that it received an application created 
event (likely originating from RM) and trigger writing to these tables.

Also, note that we want to store the application created timestamp, and also 
application finished event along with its timestamp. That's not for this table 
but for other tables that are mentioned in the schema proposal doc. To be able 
to do these as well, it would be most natural to do it based on seeing these 
events.

Thoughts?

> [Storage Implementation] Implement storage reader interface to fetch raw data 
> from HBase backend
> 
>
> Key: YARN-3049
> URL: https://issues.apache.org/jira/browse/YARN-3049
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Reporter: Sangjin Lee
>Assignee: Zhijie Shen
> Attachments: YARN-3049-WIP.1.patch, YARN-3049-WIP.2.patch, 
> YARN-3049-WIP.3.patch, YARN-3049-YARN-2928.2.patch, 
> YARN-3049-YARN-2928.3.patch, YARN-3049-YARN-2928.4.patch
>
>
> Implement existing ATS queries with the new ATS reader design.




[jira] [Commented] (YARN-3049) [Storage Implementation] Implement storage reader interface to fetch raw data from HBase backend

2015-07-31 Thread Zhijie Shen (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-3049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14649915#comment-14649915
 ] 

Zhijie Shen commented on YARN-3049:
---

I uploaded a new patch to address Sangjin's comments, except those below:

bq. l.93: What does it mean to indicate newApp for a set of entities? What if 
the set of entities contains bunch of different applications?

I don't worry about this, because the put request to the app collector is 
related to the same app.

bq. See comments above; rather than relying on the boolean flag in the 
arguments, can we detect the case of the application created event and do it?

See my comments above.

> [Storage Implementation] Implement storage reader interface to fetch raw data 
> from HBase backend
> 
>
> Key: YARN-3049
> URL: https://issues.apache.org/jira/browse/YARN-3049
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Reporter: Sangjin Lee
>Assignee: Zhijie Shen
> Attachments: YARN-3049-WIP.1.patch, YARN-3049-WIP.2.patch, 
> YARN-3049-WIP.3.patch, YARN-3049-YARN-2928.2.patch, 
> YARN-3049-YARN-2928.3.patch, YARN-3049-YARN-2928.4.patch
>
>
> Implement existing ATS queries with the new ATS reader design.




[jira] [Commented] (YARN-3049) [Storage Implementation] Implement storage reader interface to fetch raw data from HBase backend

2015-07-31 Thread Zhijie Shen (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-3049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14649770#comment-14649770
 ] 

Zhijie Shen commented on YARN-3049:
---

[~sjlee0], yeah, I agree it's not a decent solution to let the user code 
trigger writing the app-to-flow mapping. The reason I did this before is 
that it avoids a check-and-put for each individual entity put request, which 
would obviously slow down the write path.  Detecting the application created 
event sounds like a reasonable option.  However, I'm afraid we cannot hide it 
inside the writer as an implementation detail, because the writer is bound to 
the session of an application. One solution I can think of is handling the 
session start in the app collector: upon the first put request received by the 
app collector, we tell the writer to also write the app-to-flow mapping. What 
do you think?
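
The "first put request" idea above can be sketched roughly as follows. This is 
only an illustration under stated assumptions: {{SketchWriter}}, 
{{SketchAppCollector}} and {{RecordingWriter}} are hypothetical stand-ins, not 
the real TimelineCollector or TimelineWriter classes, and entities are reduced 
to plain string ids.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicBoolean;

/** Hypothetical stand-in for the storage writer; not the real TimelineWriter API. */
interface SketchWriter {
  void writeEntities(String appId, List<String> entityIds);

  void writeAppToFlowMapping(String appId, String flowName, long flowRunId);
}

/** Hypothetical per-application collector that owns the "session start" logic. */
final class SketchAppCollector {
  private final String appId;
  private final String flowName;
  private final long flowRunId;
  private final SketchWriter writer;
  // Flips to true exactly once, on the first put handled by this collector.
  private final AtomicBoolean mappingWritten = new AtomicBoolean(false);

  SketchAppCollector(String appId, String flowName, long flowRunId,
      SketchWriter writer) {
    this.appId = appId;
    this.flowName = flowName;
    this.flowRunId = flowRunId;
    this.writer = writer;
  }

  void putEntities(List<String> entityIds) {
    // compareAndSet lets only one caller win, even with concurrent puts.
    if (mappingWritten.compareAndSet(false, true)) {
      writer.writeAppToFlowMapping(appId, flowName, flowRunId);
    }
    writer.writeEntities(appId, entityIds);
  }
}

/** In-memory writer that records the calls it receives, for demonstration. */
final class RecordingWriter implements SketchWriter {
  final List<String> calls = new ArrayList<>();

  @Override
  public void writeEntities(String appId, List<String> entityIds) {
    calls.add("entities:" + appId + ":" + entityIds.size());
  }

  @Override
  public void writeAppToFlowMapping(String appId, String flowName,
      long flowRunId) {
    calls.add("app2flow:" + appId + "->" + flowName + "/" + flowRunId);
  }
}
```

With this shape the callers never pass a flag. One open point the sketch does 
not address: a restarted collector would write the mapping again, which is 
presumably harmless only if that write is idempotent.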

> [Storage Implementation] Implement storage reader interface to fetch raw data 
> from HBase backend
> 
>
> Key: YARN-3049
> URL: https://issues.apache.org/jira/browse/YARN-3049
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Reporter: Sangjin Lee
>Assignee: Zhijie Shen
> Attachments: YARN-3049-WIP.1.patch, YARN-3049-WIP.2.patch, 
> YARN-3049-WIP.3.patch, YARN-3049-YARN-2928.2.patch, 
> YARN-3049-YARN-2928.3.patch
>
>
> Implement existing ATS queries with the new ATS reader design.




[jira] [Commented] (YARN-3049) [Storage Implementation] Implement storage reader interface to fetch raw data from HBase backend

2015-07-31 Thread Sangjin Lee (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-3049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14649658#comment-14649658
 ] 

Sangjin Lee commented on YARN-3049:
---

Sorry [~zjshen] it took me a while to get to this. The patch looks pretty good 
actually. I have one high level point I'd like to discuss with you, and several 
smaller comments.

I see that you added a new boolean argument in 
{{TimelineCollector.putEntity()}}, {{TimelineCollector.putEntities()}}, and 
{{TimelineWriter.write()}} to indicate we're dealing with a new app (and thus 
writing to the app-to-flow table). I'm not sure whether that is really what we 
want to do. Can we not detect and leverage the fact that we're dealing with an 
"application created" event and trigger those actions instead of having an 
explicit argument that gets passed down all the way from the clients? First, in 
this approach we would be completely relying on the client code to specify this 
correctly. Secondly, I would argue that the fact that we need to detect that 
we're introducing a new application and write to these tables is somewhat of an 
"implementation detail" of the HBase writer. For example, other writers may not 
even care about that and have no need for it. The fact that this detail leaks 
all the way to the callers is awkward at best.

My initial thinking of how to do this was inside {{HBaseTimelineWriterImpl}} on 
detecting the application created event to trigger this action. What do you 
think?

(TimelineEntity.java)
- l.138: it might be better to use the type {{SortedSet}} or {{NavigableSet}} 
to make it explicit we want ordering

(TimelineCollector.java)
- l.93: What does it mean to indicate newApp for a set of entities? What if the 
set of entities contains a bunch of different applications?

(HBaseTimelineWriterImpl.java)
- See comments above; rather than relying on the boolean flag in the arguments, 
can we detect the case of the application created event and do it?

(ColumnPrefix.java)
- l.67: nit: I think the word "from" is needed there. It's just that the space 
was missing between "result" and "from".

(TimelineReaderUtils.java)
- l.33: nit: "both matches" -> "both match"
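
For the high-level point above (detecting the "application created" event 
inside the writer instead of passing a flag), a minimal sketch could look like 
this. The entity type and event id strings are assumptions made for 
illustration, and {{CreatedEventDetector}} is a hypothetical helper, not part 
of the patch.

```java
import java.util.List;

/** Hypothetical helper a writer could call before deciding what to write. */
final class CreatedEventDetector {
  // Assumption: illustrative values; the real constants may differ.
  static final String APP_ENTITY_TYPE = "YARN_APPLICATION";
  static final String APP_CREATED_EVENT_ID = "YARN_APPLICATION_CREATED";

  private CreatedEventDetector() {
  }

  /**
   * Returns true only for an application entity that carries the created
   * event, which is the one case where the app-to-flow row needs writing.
   */
  static boolean isApplicationCreated(String entityType,
      List<String> eventIds) {
    if (!APP_ENTITY_TYPE.equals(entityType) || eventIds == null) {
      return false;
    }
    return eventIds.contains(APP_CREATED_EVENT_ID);
  }
}
```

A writer that does not need the mapping can simply ignore the check, which 
keeps the detail out of the caller-facing signatures.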


> [Storage Implementation] Implement storage reader interface to fetch raw data 
> from HBase backend
> 
>
> Key: YARN-3049
> URL: https://issues.apache.org/jira/browse/YARN-3049
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Reporter: Sangjin Lee
>Assignee: Zhijie Shen
> Attachments: YARN-3049-WIP.1.patch, YARN-3049-WIP.2.patch, 
> YARN-3049-WIP.3.patch, YARN-3049-YARN-2928.2.patch, 
> YARN-3049-YARN-2928.3.patch
>
>
> Implement existing ATS queries with the new ATS reader design.




[jira] [Commented] (YARN-3049) [Storage Implementation] Implement storage reader interface to fetch raw data from HBase backend

2015-07-30 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-3049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14648336#comment-14648336
 ] 

Hadoop QA commented on YARN-3049:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | pre-patch |  17m  2s | Findbugs (version ) appears to 
be broken on YARN-2928. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to 
include 3 new or modified test files. |
| {color:green}+1{color} | javac |   7m 48s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |   9m 45s | There were no new javadoc 
warning messages. |
| {color:green}+1{color} | release audit |   0m 23s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:green}+1{color} | checkstyle |   1m 42s | There were no new checkstyle 
issues. |
| {color:green}+1{color} | whitespace |   0m 16s | The patch has no lines that 
end in whitespace. |
| {color:green}+1{color} | install |   1m 25s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 41s | The patch built with 
eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   3m 46s | The patch does not introduce 
any new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | yarn tests |   0m 23s | Tests passed in 
hadoop-yarn-api. |
| {color:green}+1{color} | yarn tests |  53m  2s | Tests passed in 
hadoop-yarn-server-resourcemanager. |
| {color:green}+1{color} | yarn tests |   1m 24s | Tests passed in 
hadoop-yarn-server-timelineservice. |
| | |  97m 43s | |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12748046/YARN-3049-YARN-2928.3.patch
 |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | YARN-2928 / df0ec47 |
| hadoop-yarn-api test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/8722/artifact/patchprocess/testrun_hadoop-yarn-api.txt
 |
| hadoop-yarn-server-resourcemanager test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/8722/artifact/patchprocess/testrun_hadoop-yarn-server-resourcemanager.txt
 |
| hadoop-yarn-server-timelineservice test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/8722/artifact/patchprocess/testrun_hadoop-yarn-server-timelineservice.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/8722/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf906.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/8722/console |


This message was automatically generated.

> [Storage Implementation] Implement storage reader interface to fetch raw data 
> from HBase backend
> 
>
> Key: YARN-3049
> URL: https://issues.apache.org/jira/browse/YARN-3049
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Reporter: Sangjin Lee
>Assignee: Zhijie Shen
> Attachments: YARN-3049-WIP.1.patch, YARN-3049-WIP.2.patch, 
> YARN-3049-WIP.3.patch, YARN-3049-YARN-2928.2.patch, 
> YARN-3049-YARN-2928.3.patch
>
>
> Implement existing ATS queries with the new ATS reader design.




[jira] [Commented] (YARN-3049) [Storage Implementation] Implement storage reader interface to fetch raw data from HBase backend

2015-07-30 Thread Li Lu (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-3049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14648243#comment-14648243
 ] 

Li Lu commented on YARN-3049:
-

Hi [~zjshen]! Some of my comments:

bq. As I see a lot of arguments for the reader interface (as well as the writer 
one) and the potential signature change in future (e.g, adding newApp in this 
patch), I start to think of grouping the primitive arguments, shielding them in 
some category object, such as EntityContext, EntityFilters, Opts and so on, and 
using these as the arguments of the interface instead. 

I agree. Actually I spent quite some time wondering if we really need to add 
the {{newApp}} argument in this patch. Encapsulating all related information 
into a category object appears to be a nice way to avoid future interface 
changes. +1. 

bq. Given it may be a non-trivial work, can we get this patch in and follow up 
the filter change in another jira just in case?

Definitely. Let's consolidate the whole workflow first. Then we can start these 
improvements. 

bq. In fact, it has been tested. I change the write path by letting newApp = 
true, and check if we can query the entity successfully without giving the 
flow/flowRun explicitly. However, I didn't do much assertion around the fields 
of retrieved entities, because I consider of deferring this work together with 
rewriting the whole HBase backend unit test.

Sounds good to me. 

> [Storage Implementation] Implement storage reader interface to fetch raw data 
> from HBase backend
> 
>
> Key: YARN-3049
> URL: https://issues.apache.org/jira/browse/YARN-3049
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Reporter: Sangjin Lee
>Assignee: Zhijie Shen
> Attachments: YARN-3049-WIP.1.patch, YARN-3049-WIP.2.patch, 
> YARN-3049-WIP.3.patch, YARN-3049-YARN-2928.2.patch, 
> YARN-3049-YARN-2928.3.patch
>
>
> Implement existing ATS queries with the new ATS reader design.




[jira] [Commented] (YARN-3049) [Storage Implementation] Implement storage reader interface to fetch raw data from HBase backend

2015-07-30 Thread Zhijie Shen (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-3049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14648218#comment-14648218
 ] 

Zhijie Shen commented on YARN-3049:
---

[~gtCarrera9], thanks for the review. I've addressed most of your comments in 
the new patch, except the following:

bq. However, I still incline to proceed the changes in this JIRA so that we can 
speed up consolidating our POC patches.

Exactly.

bq. Reader interface: use TimelineCollectorContext to package reader arguments?

Yeah, I can see the rationale behind it, but maybe it's not 
TimelineCollectorContext. As I see a lot of arguments for the reader interface 
(as well as the writer one) and potential signature changes in the future 
(e.g., adding newApp in this patch), I've started to think of grouping the 
primitive arguments and wrapping them in category objects, such as 
EntityContext, EntityFilters, Opts and so on, and using these as the arguments 
of the interface instead. Then, if we want to add newApp here, we don't really 
need to change the method signature; we just add a getter/setter in Opts. 
Please let me know what you think about the idea. I can file another jira to 
deal with the issue.
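
A rough sketch of the grouping idea, under the assumption that the category 
objects look something like the following. The class shapes, field choices and 
the {{GroupedArgsWriter}} / {{LoggingWriter}} names are hypothetical; only the 
names EntityContext and Opts come from the comment above.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical: identifies where an entity lives (fields are illustrative). */
final class EntityContext {
  private final String clusterId;
  private final String appId;

  EntityContext(String clusterId, String appId) {
    this.clusterId = clusterId;
    this.appId = appId;
  }

  String getClusterId() {
    return clusterId;
  }

  String getAppId() {
    return appId;
  }
}

/** Hypothetical: optional switches; new ones go here, not into signatures. */
final class Opts {
  private boolean newApp;

  boolean isNewApp() {
    return newApp;
  }

  Opts setNewApp(boolean newApp) {
    this.newApp = newApp;
    return this;
  }
}

/** Hypothetical writer interface whose signature stays stable as Opts grows. */
interface GroupedArgsWriter {
  void write(EntityContext context, List<String> entityIds, Opts opts);
}

/** Toy implementation that records which tables it would have touched. */
final class LoggingWriter implements GroupedArgsWriter {
  final List<String> touchedTables = new ArrayList<>();

  @Override
  public void write(EntityContext context, List<String> entityIds, Opts opts) {
    if (opts.isNewApp()) {
      touchedTables.add("app_flow:" + context.getAppId());
    }
    touchedTables.add("entity:" + context.getClusterId() + ":"
        + entityIds.size());
  }
}
```

Adding another option later would then mean one more getter/setter pair in 
{{Opts}}, with no change to {{write()}} or its callers.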

bq. We're now performing filters by ourselves in memory. I'm wondering if it 
will be more efficient to translate some of our filter specifications into 
HBase filters?

That sounds like a good idea, which should potentially improve the read 
performance. Let me investigate how to map our filters onto HBase filters and 
push them down to the backend. Given that it may be non-trivial work, can we 
get this patch in and follow up on the filter change in another jira just in 
case?

bq. Add a specific test in TestHBaseTimelineWriterImpl for App2FlowTable?

In fact, it has been tested. I changed the write path by setting newApp = true, 
and checked that we can query the entity successfully without giving the 
flow/flowRun explicitly. However, I didn't add many assertions around the 
fields of the retrieved entities, because I'm considering deferring this work 
and doing it together with rewriting the whole HBase backend unit test. The 
current tests are too preliminary to capture the potential bugs around DB 
operations.

> [Storage Implementation] Implement storage reader interface to fetch raw data 
> from HBase backend
> 
>
> Key: YARN-3049
> URL: https://issues.apache.org/jira/browse/YARN-3049
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Reporter: Sangjin Lee
>Assignee: Zhijie Shen
> Attachments: YARN-3049-WIP.1.patch, YARN-3049-WIP.2.patch, 
> YARN-3049-WIP.3.patch, YARN-3049-YARN-2928.2.patch, 
> YARN-3049-YARN-2928.3.patch
>
>
> Implement existing ATS queries with the new ATS reader design.




[jira] [Commented] (YARN-3049) [Storage Implementation] Implement storage reader interface to fetch raw data from HBase backend

2015-07-29 Thread Li Lu (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-3049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14646893#comment-14646893
 ] 

Li Lu commented on YARN-3049:
-

Hi [~zjshen], some of my comments:

- The addition of {{newApp}} is to indicate whether we need to update the 
app2flow index table. This change is an interface change and it's slightly more 
than I thought. However, I'm still inclined to proceed with the changes in this 
JIRA so that we can speed up consolidating our POC patches. 

- FileSystemTimelineReaderImpl, in {{fillFields}}, maybe we can use 
EnumSet.allOf() to generate the universe of fields so that we can reuse the 
logic of the following for loop for Field.ALL? 

- Reader interface: use TimelineCollectorContext to package reader arguments?

- HBaseTimelineReaderImpl:
l.160 (all line numbers are after patch)
{code}
byte[] row = result.getRow();
{code}
unused? 

l.213 name of private method {{getEntity}}: I think we may want to distinguish 
it from the external {{getEntity}} API. How about parseEntity or 
getEntityFromResult? 

We're now performing filters by ourselves in memory. I'm wondering if it will 
be more efficient to translate some of our filter specifications into HBase 
filters? 

l.113, 136, 142: I'm a little bit worried about the {{0L}}s. Shall we have 
something like DEFAULT_TIME to make the argument list more readable? 

I assume the problem raised in l.369 ("if the event come with no info, it will 
be missed") will be addressed after YARN-3984? 

- HBaseTimelineWriterImpl:
l.121-122: The log message is unclear about whether the write happened on the 
App2Flow table. Also, we may want to keep this message at debug level?

- TimelineSchemaCreator:
Why are we not adding {{a2f}} as an option, similar to what we did in l.94-102 
for {{e}} and {{m}}?

- App2FlowColumn:
l.51, {{private}} appears to be redundant in enums. Similarly in l.42 or 
App2FlowColumnFamily. 

nits: 
- Name of App2FlowTable, AppToFlowTable? Saving one character every time is not 
quite helpful...

- l. 248, 263, 336: I'm confused by the name readConnections...

- Add a specific test in TestHBaseTimelineWriterImpl for App2FlowTable? 
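
The {{EnumSet.allOf()}} suggestion for {{fillFields}} above could be sketched 
as below. {{SketchField}} is a local stand-in enum whose constants are assumed 
for illustration, and {{FieldExpansion}} is a hypothetical helper; the point is 
only that Field.ALL gets expanded once so a single loop can handle every case.

```java
import java.util.EnumSet;

/** Local stand-in for the reader's field enum; constants are assumptions. */
enum SketchField {
  ALL, EVENTS, INFO, METRICS, CONFIGS, RELATES_TO, IS_RELATED_TO
}

/** Hypothetical helper that normalizes the requested fields up front. */
final class FieldExpansion {
  private FieldExpansion() {
  }

  /**
   * Expands ALL into every concrete field so the caller can iterate over the
   * result with one loop instead of special-casing ALL.
   */
  static EnumSet<SketchField> expand(EnumSet<SketchField> requested) {
    if (requested == null || requested.isEmpty()) {
      return EnumSet.noneOf(SketchField.class);
    }
    if (requested.contains(SketchField.ALL)) {
      EnumSet<SketchField> all = EnumSet.allOf(SketchField.class);
      all.remove(SketchField.ALL); // ALL is a marker, not a concrete field
      return all;
    }
    return EnumSet.copyOf(requested);
  }
}
```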

> [Storage Implementation] Implement storage reader interface to fetch raw data 
> from HBase backend
> 
>
> Key: YARN-3049
> URL: https://issues.apache.org/jira/browse/YARN-3049
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Reporter: Sangjin Lee
>Assignee: Zhijie Shen
> Attachments: YARN-3049-WIP.1.patch, YARN-3049-WIP.2.patch, 
> YARN-3049-WIP.3.patch, YARN-3049-YARN-2928.2.patch
>
>
> Implement existing ATS queries with the new ATS reader design.




[jira] [Commented] (YARN-3049) [Storage Implementation] Implement storage reader interface to fetch raw data from HBase backend

2015-07-29 Thread Li Lu (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-3049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14646465#comment-14646465
 ] 

Li Lu commented on YARN-3049:
-

Thanks [~zjshen]! For now I think it's fine to include the changes on app2flow 
table. I'll take a look at your latest patch. 

> [Storage Implementation] Implement storage reader interface to fetch raw data 
> from HBase backend
> 
>
> Key: YARN-3049
> URL: https://issues.apache.org/jira/browse/YARN-3049
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Reporter: Sangjin Lee
>Assignee: Zhijie Shen
> Attachments: YARN-3049-WIP.1.patch, YARN-3049-WIP.2.patch, 
> YARN-3049-WIP.3.patch, YARN-3049-YARN-2928.2.patch
>
>
> Implement existing ATS queries with the new ATS reader design.




[jira] [Commented] (YARN-3049) [Storage Implementation] Implement storage reader interface to fetch raw data from HBase backend

2015-07-29 Thread Zhijie Shen (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-3049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14646339#comment-14646339
 ] 

Zhijie Shen commented on YARN-3049:
---

TestApplicationPriority.testApplicationPriorityAllocation seems to have a race 
condition. I cannot reproduce it locally, either on trunk or on YARN-2928 with 
this patch. Anyway, it seems to be unrelated to this jira. Will file a separate 
Jira to track the test failure.

> [Storage Implementation] Implement storage reader interface to fetch raw data 
> from HBase backend
> 
>
> Key: YARN-3049
> URL: https://issues.apache.org/jira/browse/YARN-3049
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Reporter: Sangjin Lee
>Assignee: Zhijie Shen
> Attachments: YARN-3049-WIP.1.patch, YARN-3049-WIP.2.patch, 
> YARN-3049-WIP.3.patch, YARN-3049-YARN-2928.2.patch
>
>
> Implement existing ATS queries with the new ATS reader design.




[jira] [Commented] (YARN-3049) [Storage Implementation] Implement storage reader interface to fetch raw data from HBase backend

2015-07-28 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-3049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14645444#comment-14645444
 ] 

Hadoop QA commented on YARN-3049:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | pre-patch |  18m  5s | Findbugs (version ) appears to 
be broken on YARN-2928. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to 
include 3 new or modified test files. |
| {color:green}+1{color} | javac |   8m  2s | There were no new javac warning 
messages. |
| {color:red}-1{color} | javadoc |  10m 10s | The applied patch generated  5  
additional warning messages. |
| {color:green}+1{color} | release audit |   0m 25s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:green}+1{color} | checkstyle |   1m 44s | There were no new checkstyle 
issues. |
| {color:green}+1{color} | whitespace |   0m 11s | The patch has no lines that 
end in whitespace. |
| {color:green}+1{color} | install |   1m 27s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 41s | The patch built with 
eclipse:eclipse. |
| {color:red}-1{color} | findbugs |   3m 50s | The patch appears to introduce 7 
new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | yarn tests |   0m 23s | Tests passed in 
hadoop-yarn-api. |
| {color:red}-1{color} | yarn tests |  53m 13s | Tests failed in 
hadoop-yarn-server-resourcemanager. |
| {color:green}+1{color} | yarn tests |   1m 26s | Tests passed in 
hadoop-yarn-server-timelineservice. |
| | |  99m 47s | |
\\
\\
|| Reason || Tests ||
| FindBugs | module:hadoop-yarn-server-resourcemanager |
| Failed unit tests | 
hadoop.yarn.server.resourcemanager.scheduler.capacity.TestApplicationPriority |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12747693/YARN-3049-YARN-2928.2.patch
 |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | YARN-2928 / df0ec47 |
| javadoc | 
https://builds.apache.org/job/PreCommit-YARN-Build/8701/artifact/patchprocess/diffJavadocWarnings.txt
 |
| Findbugs warnings | 
https://builds.apache.org/job/PreCommit-YARN-Build/8701/artifact/patchprocess/newPatchFindbugsWarningshadoop-yarn-server-resourcemanager.html
 |
| hadoop-yarn-api test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/8701/artifact/patchprocess/testrun_hadoop-yarn-api.txt
 |
| hadoop-yarn-server-resourcemanager test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/8701/artifact/patchprocess/testrun_hadoop-yarn-server-resourcemanager.txt
 |
| hadoop-yarn-server-timelineservice test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/8701/artifact/patchprocess/testrun_hadoop-yarn-server-timelineservice.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/8701/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf903.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/8701/console |


This message was automatically generated.

> [Storage Implementation] Implement storage reader interface to fetch raw data 
> from HBase backend
> 
>
> Key: YARN-3049
> URL: https://issues.apache.org/jira/browse/YARN-3049
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Reporter: Sangjin Lee
>Assignee: Zhijie Shen
> Attachments: YARN-3049-WIP.1.patch, YARN-3049-WIP.2.patch, 
> YARN-3049-WIP.3.patch, YARN-3049-YARN-2928.2.patch
>
>
> Implement existing ATS queries with the new ATS reader design.




[jira] [Commented] (YARN-3049) [Storage Implementation] Implement storage reader interface to fetch raw data from HBase backend

2015-07-28 Thread Li Lu (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-3049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14645196#comment-14645196
 ] 

Li Lu commented on YARN-3049:
-

Given the progress on YARN-3949, shall we focus back onto this JIRA now? IIUC 
we can also build offline readers on top of this JIRA. 

> [Storage Implementation] Implement storage reader interface to fetch raw data 
> from HBase backend
> 
>
> Key: YARN-3049
> URL: https://issues.apache.org/jira/browse/YARN-3049
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Reporter: Sangjin Lee
>Assignee: Zhijie Shen
> Attachments: YARN-3049-WIP.1.patch, YARN-3049-WIP.2.patch, 
> YARN-3049-WIP.3.patch
>
>
> Implement existing ATS queries with the new ATS reader design.




[jira] [Commented] (YARN-3049) [Storage Implementation] Implement storage reader interface to fetch raw data from HBase backend

2015-07-20 Thread Zhijie Shen (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-3049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14634352#comment-14634352
 ] 

Zhijie Shen commented on YARN-3049:
---

[~sjlee0], yeah, for POC purposes, I temporarily flush upon each put. I 
suspect it will significantly impact the write performance. We may need to sync 
on this issue.

> [Storage Implementation] Implement storage reader interface to fetch raw data 
> from HBase backend
> 
>
> Key: YARN-3049
> URL: https://issues.apache.org/jira/browse/YARN-3049
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Reporter: Sangjin Lee
>Assignee: Zhijie Shen
> Attachments: YARN-3049-WIP.1.patch, YARN-3049-WIP.2.patch, 
> YARN-3049-WIP.3.patch
>
>
> Implement existing ATS queries with the new ATS reader design.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (YARN-3049) [Storage Implementation] Implement storage reader interface to fetch raw data from HBase backend

2015-07-20 Thread Sangjin Lee (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-3049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14634326#comment-14634326
 ] 

Sangjin Lee commented on YARN-3049:
---

I do see that you're adding a call to {{BufferedMutator.flush()}} here, as well 
as part of the fix that went into YARN-3908 (writing to the event column 
prefix as opposed to the incorrect metric column prefix).

I'll go over WIP patch v.2 soon...

> [Storage Implementation] Implement storage reader interface to fetch raw data 
> from HBase backend
> 
>
> Key: YARN-3049
> URL: https://issues.apache.org/jira/browse/YARN-3049
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Reporter: Sangjin Lee
>Assignee: Zhijie Shen
> Attachments: YARN-3049-WIP.1.patch, YARN-3049-WIP.2.patch, 
> YARN-3049-WIP.3.patch
>
>
> Implement existing ATS queries with the new ATS reader design.




[jira] [Commented] (YARN-3049) [Storage Implementation] Implement storage reader interface to fetch raw data from HBase backend

2015-07-16 Thread Varun Saxena (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-3049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14630867#comment-14630867
 ] 

Varun Saxena commented on YARN-3049:


[~zjshen], should cluster ID be mandatory in the REST URL?
We can assume it belongs to the same cluster where this timeline reader 
is running and take it from the config if it's not supplied by the client.
That's how I did it in YARN-3814.
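
The fallback described above might look roughly like this. It is a sketch 
only: the configuration is modeled as a plain {{Map}}, the key name and the 
default value are assumptions used for illustration, and 
{{ClusterIdResolver}} is a hypothetical helper rather than the code from 
YARN-3814.

```java
import java.util.Map;

/** Hypothetical helper that picks the cluster id for a reader request. */
final class ClusterIdResolver {
  // Assumption: illustrative key and default, not confirmed by this thread.
  static final String CLUSTER_ID_KEY = "yarn.resourcemanager.cluster-id";
  static final String DEFAULT_CLUSTER_ID = "default-cluster";

  private ClusterIdResolver() {
  }

  /**
   * Prefers the id from the REST request; falls back to the reader's own
   * configuration, then to a fixed default.
   */
  static String resolve(String fromRequest, Map<String, String> conf) {
    if (fromRequest != null && !fromRequest.trim().isEmpty()) {
      return fromRequest.trim();
    }
    String configured = conf.get(CLUSTER_ID_KEY);
    return configured != null ? configured : DEFAULT_CLUSTER_ID;
  }
}
```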


> [Storage Implementation] Implement storage reader interface to fetch raw data 
> from HBase backend
> 
>
> Key: YARN-3049
> URL: https://issues.apache.org/jira/browse/YARN-3049
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Reporter: Sangjin Lee
>Assignee: Zhijie Shen
> Attachments: YARN-3049-WIP.1.patch, YARN-3049-WIP.2.patch
>
>
> Implement existing ATS queries with the new ATS reader design.




[jira] [Commented] (YARN-3049) [Storage Implementation] Implement storage reader interface to fetch raw data from HBase backend

2015-07-14 Thread Li Lu (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-3049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14627349#comment-14627349
 ] 

Li Lu commented on YARN-3049:
-

Hi [~zjshen], I have a concern similar to [~sjlee0]'s, on reading timeline 
metrics:
{code}
+  // Simply assume that if the value set contains more than 1 elements, the
+  // metric is a TIME_SERIES metric, otherwise, it's a TIME_SERIES metric
+  metric.setType(metricResult.getValue().size() > 1 ?
+  TimelineMetric.Type.TIME_SERIES : TimelineMetric.Type.TIME_SERIES);
{code}

I thought you meant to say: if the size of the value set is greater than one, 
set the type to TIME_SERIES; otherwise, set it to SINGLE_VALUE? As written, 
both branches return TIME_SERIES, so we can never read a SINGLE_VALUE metric 
out... 
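
A minimal sketch of the intended logic, assuming the two branches were simply 
meant to differ. {{MetricTypeInference}} and its nested {{Type}} enum are local 
stand-ins with assumed constant names, not the real TimelineMetric class.

```java
import java.util.Map;

/** Hypothetical helper mirroring the size-based heuristic from the patch. */
final class MetricTypeInference {
  /** Local stand-in for the metric type enum; names are assumptions. */
  enum Type { SINGLE_VALUE, TIME_SERIES }

  private MetricTypeInference() {
  }

  /** More than one timestamped value is treated as a time series. */
  static Type infer(Map<Long, Number> values) {
    return values.size() > 1 ? Type.TIME_SERIES : Type.SINGLE_VALUE;
  }
}
```

Note that the heuristic itself is lossy: a time series that has only one data 
point so far would still be reported as a single value.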

> [Storage Implementation] Implement storage reader interface to fetch raw data 
> from HBase backend
> 
>
> Key: YARN-3049
> URL: https://issues.apache.org/jira/browse/YARN-3049
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Reporter: Sangjin Lee
>Assignee: Zhijie Shen
> Attachments: YARN-3049-WIP.1.patch
>
>
> Implement existing ATS queries with the new ATS reader design.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (YARN-3049) [Storage Implementation] Implement storage reader interface to fetch raw data from HBase backend

2015-07-14 Thread Sangjin Lee (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-3049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14627080#comment-14627080
 ] 

Sangjin Lee commented on YARN-3049:
---

Thanks [~zjshen] for your WIP patch! I skimmed through it, and I generally 
agree with the approach you're taking in this patch.

Some early comments and thoughts:
- Later we could work on the filtering code to make it more expressive, etc. I 
see you have defined a number of {{match*()}} methods, and that's a good start 
in that direction.
- {{lookupFlowContext()}}: I suspect we might want to cache the flow context 
for better performance. Ideally it would need to be limited by size (LRU).
- Maybe a nit, but instead of setting something and clearing it later on if it 
is not supposed to be retrieved, how about setting it only if it is supposed to 
be retrieved? I'm talking about code that fetches contents such as relatesTo, 
info, config, events, ...
- {{getEntities()}}: just break instead of pollLast()?
- {{readMetrics()}}: SINGLE_VALUE v. TIME_SERIES?
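
For the {{lookupFlowContext()}} caching suggestion above, a size-bounded LRU 
cache can be sketched with {{LinkedHashMap}} in access order. The class name 
{{FlowContextCache}} and the generic key/value types are hypothetical; the 
patch itself may choose a different cache.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical size-bounded LRU cache for looked-up flow contexts. */
final class FlowContextCache<K, V> extends LinkedHashMap<K, V> {
  private static final long serialVersionUID = 1L;
  private final int maxEntries;

  FlowContextCache(int maxEntries) {
    // accessOrder = true makes get() move an entry to the "most recent" end.
    super(16, 0.75f, true);
    this.maxEntries = maxEntries;
  }

  @Override
  protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
    // Called after each insertion; evicts the least recently used entry.
    return size() > maxEntries;
  }
}
```

{{LinkedHashMap}} is not thread-safe, so a reader serving concurrent requests 
would need to wrap it (for example with {{Collections.synchronizedMap}}) or use 
a concurrent cache instead.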


> [Storage Implementation] Implement storage reader interface to fetch raw data 
> from HBase backend
> 
>
> Key: YARN-3049
> URL: https://issues.apache.org/jira/browse/YARN-3049
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Reporter: Sangjin Lee
>Assignee: Zhijie Shen
> Attachments: YARN-3049-WIP.1.patch
>
>
> Implement existing ATS queries with the new ATS reader design.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (YARN-3049) [Storage Implementation] Implement storage reader interface to fetch raw data from HBase backend

2015-07-08 Thread Zhijie Shen (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-3049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14619347#comment-14619347
 ] 

Zhijie Shen commented on YARN-3049:
---

Updated the title accordingly to describe the scope of this jira more 
accurately.

> [Storage Implementation] Implement storage reader interface to fetch raw data 
> from HBase backend
> 
>
> Key: YARN-3049
> URL: https://issues.apache.org/jira/browse/YARN-3049
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Reporter: Sangjin Lee
>Assignee: Zhijie Shen
>
> Implement existing ATS queries with the new ATS reader design.



