[jira] [Commented] (YARN-3051) [Storage abstraction] Create backing storage read interface for ATS readers

2015-05-17 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14547299#comment-14547299
 ] 

Hadoop QA commented on YARN-3051:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | patch |   0m  0s | The patch command could not apply the patch during dryrun. |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | http://issues.apache.org/jira/secure/attachment/12732621/YARN-3051.wip.02.YARN-2928.patch |
| Optional Tests | shellcheck javadoc javac unit findbugs checkstyle |
| git revision | trunk / cab0dad |
| Console output | https://builds.apache.org/job/PreCommit-YARN-Build/7963/console |


This message was automatically generated.

> [Storage abstraction] Create backing storage read interface for ATS readers
> ---
>
> Key: YARN-3051
> URL: https://issues.apache.org/jira/browse/YARN-3051
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Affects Versions: YARN-2928
>Reporter: Sangjin Lee
>Assignee: Varun Saxena
> Attachments: YARN-3051.wip.02.YARN-2928.patch, YARN-3051.wip.patch, 
> YARN-3051_temp.patch
>
>
> Per design in YARN-2928, create backing storage read interface that can be 
> implemented by multiple backing storage implementations.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3051) [Storage abstraction] Create backing storage read interface for ATS readers

2015-05-17 Thread Li Lu (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14547304#comment-14547304
 ] 

Li Lu commented on YARN-3051:
-

Hi [~varun_saxena], I think the new patch name pattern should be 
YARN-3051-YARN-2928.***.patch. Would you please try that again? Thanks! 



[jira] [Commented] (YARN-3051) [Storage abstraction] Create backing storage read interface for ATS readers

2015-05-18 Thread Varun Saxena (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14548358#comment-14548358
 ] 

Varun Saxena commented on YARN-3051:


Thanks, Li, for pointing this out. I will be updating the flow- and user-based 
APIs anyway and adding a few tests. I will name the next patch following this 
pattern.



[jira] [Commented] (YARN-3051) [Storage abstraction] Create backing storage read interface for ATS readers

2015-05-19 Thread Li Lu (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14551265#comment-14551265
 ] 

Li Lu commented on YARN-3051:
-

Hi [~varun_saxena], I just tried to apply the patch against the latest 
YARN-2928 branch, and there was a problem with pom.xml. When generating the 
next patch, could you please double-check that? I think it would be great if 
we could make some progress on the reader side now, so that we can have a working 
end-to-end v2 preview soon. Thanks! 



[jira] [Commented] (YARN-3051) [Storage abstraction] Create backing storage read interface for ATS readers

2015-05-20 Thread Varun Saxena (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14552534#comment-14552534
 ] 

Varun Saxena commented on YARN-3051:


Well, I am still stuck trying to retrieve, in the WebServices class, the 
attribute set via HttpServer2#setAttribute. I will update the patch once that 
is done.



[jira] [Commented] (YARN-3051) [Storage abstraction] Create backing storage read interface for ATS readers

2015-05-20 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14553364#comment-14553364
 ] 

Hadoop QA commented on YARN-3051:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |  14m 55s | Pre-patch YARN-2928 compilation is healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any @author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to include 3 new or modified test files. |
| {color:green}+1{color} | javac |   7m 42s | There were no new javac warning messages. |
| {color:red}-1{color} | javadoc |   9m 39s | The applied patch generated 6 additional warning messages. |
| {color:red}-1{color} | release audit |   0m 19s | The applied patch generated 2 release audit warnings. |
| {color:red}-1{color} | checkstyle |   1m 19s | The applied patch generated 23 new checkstyle issues (total was 234, now 257). |
| {color:green}+1{color} | shellcheck |   0m  6s | There were no new shellcheck (v0.3.3) issues. |
| {color:red}-1{color} | whitespace |   0m  1s | The patch has 1 line(s) that end in whitespace. Use git apply --whitespace=fix. |
| {color:green}+1{color} | install |   1m 40s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 41s | The patch built with eclipse:eclipse. |
| {color:red}-1{color} | findbugs |   3m 36s | The patch appears to introduce 6 new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | yarn tests |   0m 25s | Tests passed in hadoop-yarn-api. |
| {color:green}+1{color} | yarn tests |   1m 56s | Tests passed in hadoop-yarn-common. |
| {color:green}+1{color} | yarn tests |   1m  3s | Tests passed in hadoop-yarn-server-timelineservice. |
| | |  43m 47s | |
\\
\\
|| Reason || Tests ||
| FindBugs | module:hadoop-yarn-server-timelineservice |
|  |  Found reliance on default encoding in org.apache.hadoop.yarn.server.timelineservice.storage.FileSystemTimelineReaderImpl.getEntities(String, String, String, Long, Long, Long, String, Long, Collection, Collection, Collection, Collection, Collection, EnumSet): new java.io.FileReader(File)  At FileSystemTimelineReaderImpl.java:[line 88] |
|  |  Found reliance on default encoding in org.apache.hadoop.yarn.server.timelineservice.storage.FileSystemTimelineReaderImpl.getEntity(String, String, String, String, Collection, Collection, Long, Long, EnumSet): new java.io.FileReader(File)  At FileSystemTimelineReaderImpl.java:[line 68] |
| FindBugs | module:hadoop-yarn-common |
|  |  Inconsistent synchronization of org.apache.hadoop.yarn.api.protocolrecords.impl.pb.AllocateResponsePBImpl.builder; locked 92% of time  Unsynchronized access at AllocateResponsePBImpl.java:[line 391] |
|  |  Inconsistent synchronization of org.apache.hadoop.yarn.api.protocolrecords.impl.pb.AllocateResponsePBImpl.proto; locked 94% of time  Unsynchronized access at AllocateResponsePBImpl.java:[line 391] |
|  |  Inconsistent synchronization of org.apache.hadoop.yarn.api.protocolrecords.impl.pb.AllocateResponsePBImpl.viaProto; locked 94% of time  Unsynchronized access at AllocateResponsePBImpl.java:[line 391] |
| FindBugs | module:hadoop-yarn-api |
|  |  org.apache.hadoop.yarn.api.records.timelineservice.TimelineMetric$1.compare(Long, Long) negates the return value of Long.compareTo(Long)  At TimelineMetric.java:[line 47] |
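
For readers unfamiliar with two of the warning kinds listed above (reliance on default encoding, and negating the result of compareTo), the usual remedies can be sketched as follows. This is only an illustration and not code from the patch: the class {{FindbugsFixSketch}} and its members are hypothetical names, and the actual fix in FileSystemTimelineReaderImpl or TimelineMetric may well look different.

```java
import java.io.BufferedReader;
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.Comparator;
import java.util.TreeMap;

// Hypothetical helper class, used only to illustrate the two warning kinds.
class FindbugsFixSketch {

  // Reads the first line of a file with an explicit charset. new FileReader(File)
  // would silently use the platform default encoding, which is what FindBugs flags.
  static String readFirstLine(File file) throws IOException {
    try (BufferedReader reader = new BufferedReader(
        new InputStreamReader(new FileInputStream(file), StandardCharsets.UTF_8))) {
      return reader.readLine();
    }
  }

  // Orders timestamps newest first by swapping the operands instead of
  // negating the value returned by Long.compareTo(Long).
  static final Comparator<Long> NEWEST_FIRST = new Comparator<Long>() {
    @Override
    public int compare(Long a, Long b) {
      return b.compareTo(a);
    }
  };

  // A map of timestamp to value that iterates from the newest timestamp down.
  static TreeMap<Long, String> newestFirstMap() {
    return new TreeMap<>(NEWEST_FIRST);
  }
}
```

Swapping the operands avoids the corner case in which a comparison result of Integer.MIN_VALUE cannot be negated.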
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | http://issues.apache.org/jira/secure/attachment/12734255/YARN-3051-YARN-2928.03.patch |
| Optional Tests | shellcheck javadoc javac unit findbugs checkstyle |
| git revision | YARN-2928 / 463e070 |
| javadoc | https://builds.apache.org/job/PreCommit-YARN-Build/8034/artifact/patchprocess/diffJavadocWarnings.txt |
| Release Audit | https://builds.apache.org/job/PreCommit-YARN-Build/8034/artifact/patchprocess/patchReleaseAuditProblems.txt |
| checkstyle | https://builds.apache.org/job/PreCommit-YARN-Build/8034/artifact/patchprocess/diffcheckstylehadoop-yarn-api.txt |
| whitespace | https://builds.apache.org/job/PreCommit-YARN-Build/8034/artifact/patchprocess/whitespace.txt |
| Findbugs warnings | https://builds.apache.org/job/PreCommit

[jira] [Commented] (YARN-3051) [Storage abstraction] Create backing storage read interface for ATS readers

2015-05-27 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14561393#comment-14561393
 ] 

Hadoop QA commented on YARN-3051:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | patch |   0m  0s | The patch command could not apply the patch during dryrun. |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | http://issues.apache.org/jira/secure/attachment/12735644/YARN-3051-YARN-2928.003.patch |
| Optional Tests | shellcheck javadoc javac unit findbugs checkstyle |
| git revision | YARN-2928 / e19566a |
| Console output | https://builds.apache.org/job/PreCommit-YARN-Build/8100/console |


This message was automatically generated.

> [Storage abstraction] Create backing storage read interface for ATS readers
> ---
>
> Key: YARN-3051
> URL: https://issues.apache.org/jira/browse/YARN-3051
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Affects Versions: YARN-2928
>Reporter: Sangjin Lee
>Assignee: Varun Saxena
> Attachments: YARN-3051-YARN-2928.003.patch, 
> YARN-3051-YARN-2928.03.patch, YARN-3051.wip.02.YARN-2928.patch, 
> YARN-3051.wip.patch, YARN-3051_temp.patch
>
>
> Per design in YARN-2928, create backing storage read interface that can be 
> implemented by multiple backing storage implementations.





[jira] [Commented] (YARN-3051) [Storage abstraction] Create backing storage read interface for ATS readers

2015-05-27 Thread Varun Saxena (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14561641#comment-14561641
 ] 

Varun Saxena commented on YARN-3051:


There are a few things in the API designed in the patch that I wanted to discuss.

# We can either return a single timeline entity for a flow ID (having 
aggregated metric values) or multiple entities indicating multiple flow runs 
for a flow ID. I have included an API for the former as of now. I think there 
can be use cases for both, though. [~vrushalic], did hRaven have the facility 
for both kinds of queries? I mean, is there a known use case?
# Do we plan to include additional info in the user table which can be used for 
filtering user-level entities? I could not think of any use case, but just for 
flexibility I have added filters in the API {{getUserEntities}}.
# I have included an API to query flow information based on the appid. As of 
now I return the flow to which the app belongs (including multiple runs) 
instead of the flow run it belongs to. Which is the more viable scenario? Or 
do we need to support both?
# In the HBase schema design, there are 2 flow summary tables, aggregated daily 
and weekly respectively. To limit the number of metric records or to see 
metrics in a specific time window, I have added metric start and metric end 
timestamps in the API design. But if metrics are aggregated daily and weekly, 
we won't be able to get something like the value of a specific metric for a 
flow from, say, Thursday 4 pm to Friday 9 am. [~vrushalic], can you confirm? 
If this is so, a timestamp doesn't make much sense. Dates can be specified 
instead.
# Will there be queue table(s) in addition to user table(s)? If yes, how will 
queue data be aggregated? Based on entity type? I may need an additional API 
for queues then.
# The doubt I have regarding flow version will be addressed by YARN-3699 anyway.



[jira] [Commented] (YARN-3051) [Storage abstraction] Create backing storage read interface for ATS readers

2015-05-27 Thread Vrushali C (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14561770#comment-14561770
 ] 

Vrushali C commented on YARN-3051:
--

Hi Varun,

Good points. My answers are inline.
bq. We can either return a single timeline entity for a flow ID(having 
aggregated metric values) or multiple entities indicating multiple flows runs 
for a flow ID. I have included an API for the former as of now. I think there 
can be uses cases for both though. Vrushali C, did hRaven have the facility for 
both kinds of queries ? I mean, is there a known use case ?

Yes, there are use cases for both. hRaven has APIs for both types of calls, 
though they are named differently. The /flow endpoint in hRaven will return 
multiple flow runs (limited by filters). The /summary endpoint will return 
aggregated values for all the runs of that flow within the time range filter. 
Let me give an example (a hadoop sleep job for simplicity).

Say user janedoe runs a hadoop sleep job 3 times today, ran it 5 times 
yesterday and, say, 6 times on one day about a month back. Now, we may want to 
see two different things:

#1 Summarized stats for flow “Sleep job” invoked within the last 2 days: It 
would say this flow was run 8 times, the first run was at timestamp X, the last 
run was at timestamp Y, it took up a total of N megabytemillis, had a total of 
M containers across all runs, etc. It tells us how much of the cluster 
capacity a particular flow from a particular user is taking up.

#2 List of flow runs: This will show us details about each flow run. If we say 
limit = 3 in the query parameters, it would return the latest 3 runs of this 
flow. If we say limit = 100, it would return all the runs in this particular 
case (including the ones from a month back). If we pass in flowVersion=XXYYZZ, 
then it would return the list of flow runs that match this version. 

For the initial development, I think we may want to work on #2 first (return 
list of flow runs). The summary API will need aggregated tables, which we can 
add later on; we could file a jira for that, my 2c.
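
To make the difference between the two query types concrete, here is a small in-memory sketch of the janedoe example. It is not hRaven code and not the proposed reader API: {{FlowRunQuerySketch}}, {{FlowRun}}, {{FlowSummary}}, {{listRuns}} and {{summarize}} are hypothetical names, the per-run megabytemillis value of 10 is made up, and a real implementation would read from the backing storage rather than a list.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

// Hypothetical in-memory model of flow runs, for illustration only.
class FlowRunQuerySketch {
  static final long DAY_MS = 24L * 60 * 60 * 1000;

  static final class FlowRun {
    final String flowName;
    final long startTime;
    final long megabyteMillis;

    FlowRun(String flowName, long startTime, long megabyteMillis) {
      this.flowName = flowName;
      this.startTime = startTime;
      this.megabyteMillis = megabyteMillis;
    }
  }

  // Aggregated view of a flow (query type #1). The first/last run times are
  // only meaningful when runCount is greater than zero.
  static final class FlowSummary {
    int runCount;
    long firstRunTime = Long.MAX_VALUE;
    long lastRunTime = Long.MIN_VALUE;
    long totalMegabyteMillis;
  }

  // Query type #2: the latest runs of a flow, newest first, at most `limit`
  // of them. Assumes limit >= 0.
  static List<FlowRun> listRuns(List<FlowRun> all, String flowName, int limit) {
    List<FlowRun> matching = new ArrayList<>();
    for (FlowRun run : all) {
      if (run.flowName.equals(flowName)) {
        matching.add(run);
      }
    }
    Collections.sort(matching, new Comparator<FlowRun>() {
      @Override
      public int compare(FlowRun a, FlowRun b) {
        return Long.compare(b.startTime, a.startTime);
      }
    });
    return new ArrayList<>(matching.subList(0, Math.min(limit, matching.size())));
  }

  // Query type #1: one summary over all runs started in [windowStart, windowEnd).
  static FlowSummary summarize(List<FlowRun> all, String flowName,
      long windowStart, long windowEnd) {
    FlowSummary summary = new FlowSummary();
    for (FlowRun run : all) {
      if (!run.flowName.equals(flowName)
          || run.startTime < windowStart || run.startTime >= windowEnd) {
        continue;
      }
      summary.runCount++;
      summary.firstRunTime = Math.min(summary.firstRunTime, run.startTime);
      summary.lastRunTime = Math.max(summary.lastRunTime, run.startTime);
      summary.totalMegabyteMillis += run.megabyteMillis;
    }
    return summary;
  }

  // The example data: 3 runs today, 5 yesterday, 6 about a month back.
  static List<FlowRun> janedoeRuns(long now) {
    List<FlowRun> runs = new ArrayList<>();
    for (int i = 0; i < 3; i++) {
      runs.add(new FlowRun("Sleep job", now - i * 1000L, 10));
    }
    for (int i = 0; i < 5; i++) {
      runs.add(new FlowRun("Sleep job", now - DAY_MS - i * 1000L, 10));
    }
    for (int i = 0; i < 6; i++) {
      runs.add(new FlowRun("Sleep job", now - 30 * DAY_MS - i * 1000L, 10));
    }
    return runs;
  }
}
```

With this data, summarizing the window covering the last 2 days counts 8 runs, while listing with limit = 100 returns all 14 runs, including the month-old ones.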

bq. Do we plan to include additional info in the user table which can be used 
for filtering user level entites ? Could not think of any use case but just for 
flexibility I have added filters in the API getUserEntities.

I haven’t looked at the code in detail, but as such, for user-level entities, 
we would want a time range, a limit on the number of records returned, a flow 
name filter and a cluster name filter.

bq. I have included an API to query flow information based on the appid. As of 
now I return the flow to which app belongs to(includes multiple runs) instead 
of flow run it belongs to. Which is a more viable scenario ? Or we need to 
support both ?

An app id can belong to exactly one flow run. App id is the hadoop yarn 
application id, which should be unique on the cluster. Given an app id, we 
should be able to look up the exact flow run and return just that. The 
equivalent api in hRaven is /jobFlow.

bq.  But if metrics are aggregated daily and weekly, we wont be able to get 
something like value of specific metric for a flow from say Thursday 4 pm to 
Friday 9 am. Vrushali C, can you confirm ? If this is so, a timestamp doesnt 
make much sense. Dates can be specified instead.

The thinking is to split the querying across tables. We would query the daily 
summary table for the complete days and the regular flow tables for partial-day 
ranges such as Thursday 4 pm to Friday 9 am. But this does mean aggregating on 
the query side. So I think, for starters, we could allow only Date boundaries. 
We can enhance the API to accept finer timestamps later.
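
The split described above can be sketched as follows. This is only a rough illustration under stated assumptions: timestamps are non-negative epoch milliseconds, day boundaries are taken in UTC, and the names {{TimeRangeSplitSketch}}, {{Range}} and {{split}} are hypothetical rather than part of any patch.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper that decides which part of a time window could be served
// from a daily summary table and which parts need the regular flow tables.
class TimeRangeSplitSketch {
  static final long DAY_MS = 86_400_000L;

  static final class Range {
    final long start;                // inclusive
    final long end;                  // exclusive
    final boolean fromDailySummary;  // true: whole days, false: raw flow data

    Range(long start, long end, boolean fromDailySummary) {
      this.start = start;
      this.end = end;
      this.fromDailySummary = fromDailySummary;
    }
  }

  // Splits [start, end) into a leading partial day, a run of complete days and
  // a trailing partial day. Assumes 0 <= start and UTC day boundaries.
  static List<Range> split(long start, long end) {
    List<Range> parts = new ArrayList<>();
    if (start >= end) {
      return parts;
    }
    long firstFullDayStart = ((start + DAY_MS - 1) / DAY_MS) * DAY_MS;
    long lastFullDayEnd = (end / DAY_MS) * DAY_MS;
    if (firstFullDayStart >= lastFullDayEnd) {
      // No complete day inside the window: everything is read from raw data.
      parts.add(new Range(start, end, false));
      return parts;
    }
    if (start < firstFullDayStart) {
      parts.add(new Range(start, firstFullDayStart, false));
    }
    parts.add(new Range(firstFullDayStart, lastFullDayEnd, true));
    if (lastFullDayEnd < end) {
      parts.add(new Range(lastFullDayEnd, end, false));
    }
    return parts;
  }
}
```

Note that a window such as Thursday 4 pm to Friday 9 am contains no complete day, so under this scheme it would be answered from the regular flow tables alone; the daily summary table only helps once the window spans at least one full day.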

bq. Will there be queue table(s) in addition to user table(s) ? If yes, how 
will queue data be aggregated ? Based on entity type ? I may need an additional 
API for queues then.
Yes, we would need a queue based aggregation table. Right now, those details 
are to be worked out. So perhaps we can leave aside the queue based APIs (or 
file a different jira to handle queue based apis).

Hope this helps. I can give you more examples if you would like more 
details or have any other questions. I will also look at the patch this week.  
Also, we should ensure that the reader APIs and writer APIs use the same 
classes/methods for constructing and parsing keys (flow keys, row keys); 
otherwise they will diverge.

thanks
Vrushali



[jira] [Commented] (YARN-3051) [Storage abstraction] Create backing storage read interface for ATS readers

2015-05-27 Thread Varun Saxena (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14562380#comment-14562380
 ] 

Varun Saxena commented on YARN-3051:


Thanks for the replies.






[jira] [Commented] (YARN-3051) [Storage abstraction] Create backing storage read interface for ATS readers

2015-06-01 Thread Li Lu (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14568128#comment-14568128
 ] 

Li Lu commented on YARN-3051:
-

Hi [~varun_saxena], thanks for the work! Not sure if you've already made 
progress since the latest patch, but I'm posting some of my comments and 
questions w.r.t. the reader API design in the 003 patch. I may have more 
comments in the near future, but I wouldn't mind seeing a new patch before 
posting them. 

# I noticed there is a _readerLimit_ for read operations, which works for ATS 
v1. I'm wondering if it's fine to use -1 to indicate there's no such limit? Not 
sure if this feature is already there. 
# For the {{fromId}} parameter, we may need to be careful about the concept of 
"id". In timeline v2 we need context information to identify each entity, such 
as cluster, user, flow, run. When querying with {{fromId}}, what kind of 
assumptions should we make about the "id" here? Are we assuming all entities 
are of the same cluster, user, and/or flow, or is the "id" a concatenation of 
all this information, or is it something else? 
# For all filter-related parameters, I'm not sure if the current object model 
and storage implementation support a trivial solution. I'd certainly welcome 
any comments/suggestions on this problem. 
# Based on the previous two issues, a more general question is: shall we focus 
on an evolution of the v1 API here, or shall we start a v2 reader API design 
from scratch and then try to make it compatible with the v1 APIs? The current 
patch appears to pursue the evolution approach. 
# In some APIs, we require clusterID and appID but no flow/run information. In 
the current writer implementations, this implies a full table scan. Maybe we 
can have flow and run information as optional parameters so that we can avoid 
full table scans when the caller does have flow and run information?
# The current APIs require a pretty long list of parameters. For most of the 
use cases, I think we can abstract something much simpler. Do we plan to add 
those "simple APIs" in a higher layer? I think passing a lot of nulls when 
calling the reader API looks suboptimal, but with only these few APIs we may 
need to do this frequently.
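
To illustrate the point about optional flow/run information, here is a toy model of a store whose row keys start with the context. The key layout (cluster!user!flow!run!app), the class {{RowKeyLookupSketch}} and its methods are assumptions made for this sketch only; they are not the actual HBase schema or writer code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Toy sorted store standing in for a table whose row key begins with the
// context. The key layout below is an assumption for illustration.
class RowKeyLookupSketch {
  static final String SEP = "!";

  final TreeMap<String, String> table = new TreeMap<>();
  int rowsExamined;  // how many rows the last lookup had to look at

  static String rowKey(String cluster, String user, String flow, long runId,
      String appId) {
    return cluster + SEP + user + SEP + flow + SEP + runId + SEP + appId;
  }

  void put(String cluster, String user, String flow, long runId, String appId,
      String payload) {
    table.put(rowKey(cluster, user, flow, runId, appId), payload);
  }

  // user, flow and runId are optional (null when unknown). With all three the
  // lookup is a narrow key-range read; without them every row is examined.
  List<String> findApp(String cluster, String appId, String user, String flow,
      Long runId) {
    rowsExamined = 0;
    Map<String, String> candidates;
    if (user != null && flow != null && runId != null) {
      String prefix = cluster + SEP + user + SEP + flow + SEP + runId + SEP;
      candidates = table.subMap(prefix, prefix + Character.MAX_VALUE);
    } else {
      candidates = table;  // equivalent of a full table scan
    }
    List<String> hits = new ArrayList<>();
    String wantedPrefix = cluster + SEP;
    String wantedSuffix = SEP + appId;
    for (Map.Entry<String, String> row : candidates.entrySet()) {
      rowsExamined++;
      if (row.getKey().startsWith(wantedPrefix)
          && row.getKey().endsWith(wantedSuffix)) {
        hits.add(row.getValue());
      }
    }
    return hits;
  }
}
```

Both call styles return the same row; the difference is only in how many rows have to be examined to find it.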



[jira] [Commented] (YARN-3051) [Storage abstraction] Create backing storage read interface for ATS readers

2015-06-03 Thread Varun Saxena (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14570941#comment-14570941
 ] 

Varun Saxena commented on YARN-3051:


bq. I noticed there is a readerLimit for read operations, which works for ATS 
v1. I'm wondering if it's fine to use -1 to indicate there's no such limit? Not 
sure if this feature is already there.
Do you mean the limit on the number of records returned?

bq. The fromId parameter, we may need to be careful on the concept of "id". In 
timeline v2 we need context information to identify each entity, such as 
cluster, user, flow, run. When querying with fromId, what kind of assumptions 
should we make on the "id" here?
{{fromId}} is primarily there to be backward compatible with ATS v1. It is used 
in the context of entity ID only. This will be documented in the javadoc. I have 
not changed the names of the query params (where these parameters are supported 
in ATS v1).
Whether we need to support the same REST endpoints as ATS v1 for the sake of 
backward compatibility, or whether we can break backward compatibility (when 
there is no use case), is something I wanted to discuss. I commented on 
YARN-3411 as well regarding one such param.

bq. In some APIs, we're requiring clusterID and appID, but not having flow/run 
information ... Maybe we can have flow and run information as optional 
parameters so that we can avoid full table scans when the caller does have flow 
and run information?
I agree with your suggestion. I was also thinking about including them in the 
next patch as query params. This will make the parameter list even longer :)

bq. The current APIs require a pretty long list of parameters. For most of the 
use cases, I think we can abstract something much simpler.
These parameters are fetched directly from the query params coming in through 
the REST API and are passed down to the storage layer (after minor 
verification). Yes, we can decide on a few of the key parameters (which 
correspond to the row key/primary key) and have separate methods for them, 
with corresponding reader API methods as well.



[jira] [Commented] (YARN-3051) [Storage abstraction] Create backing storage read interface for ATS readers

2015-06-04 Thread Zhijie Shen (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14573830#comment-14573830
 ] 

Zhijie Shen commented on YARN-3051:
---

[~varun_saxena], thanks for working on the new patch. It seems to be a complete 
reader-side prototype, which is nice. I still need some time to take a thorough 
look, but I'd like to share my thoughts about the reader APIs.

IMHO, we may want to have or start with two sets of APIs: 1) the APIs to query 
the raw data and 2) the APIs to query the aggregation data.

1) APIs to query the raw data:

We would like to have APIs that let users zoom into the details of their 
jobs, and give users the freedom to fetch the raw data and do customized 
processing that ATS will not do. For example, Hive/Pig on Tez need this set of 
APIs to get the framework-specific data, process it and render it on their own 
web UI. We basically need 2 such APIs.

a. Get a single entity given an ID that uniquely locates the entity in the 
backend (we assume the uniqueness is assured somehow). 
* This API can be extended or split into multiple sub-APIs to get a single 
element of the entity, such as events, metrics and configuration.

b. Search for a set of entities that match the given predicates.
* We can start from the predicates that we used in ATS v1 (also for 
compatibility purposes), but some of them may no longer apply.
* We may want to add more predicates to check the newly added elements in v2.
* With more predefined semantics, we can even query entities that belong to 
some container/attempt/application and so on.

2) APIs to query the aggregation data

These are completely new in v2 and are its advantage. With the aggregation, we 
can answer some statistical questions about the job, the user, the queue, the 
flow and the cluster. These APIs do not direct users to the individual 
entities put by the application, but return statistical data (carried by 
Application|User|Queue|Flow|ClusterEntity). 

a. Get aggregation data at a certain level given the ID of the concept on that 
level, i.e., the job, the user, the queue, the flow or the cluster.

b. Search for the jobs, the users, the queues, the flows and the clusters 
given predicates.
* For the predicates, we could learn from the examples in hRaven.
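
As a rough picture of the two API sets described above, they could be written as two small interfaces with a get-by-ID method and a search-by-predicate method each. Everything in this sketch is hypothetical: the names {{RawDataReader}}, {{AggregationReader}}, {{Entity}} and {{InMemoryReader}}, the single {{cpuMillis}} metric, and the restriction of the aggregation side to the flow level are simplifications for illustration, not the proposed interface.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.Predicate;

// Hypothetical shapes for the two API sets; not the interface in the patch.
class TwoApiSetsSketch {

  static final class Entity {
    final String id;
    final String flow;
    final long cpuMillis;

    Entity(String id, String flow, long cpuMillis) {
      this.id = id;
      this.flow = flow;
      this.cpuMillis = cpuMillis;
    }
  }

  // 1) Raw data: (a) get one entity by ID, (b) search by predicate.
  interface RawDataReader {
    Entity getEntity(String entityId);
    List<Entity> getEntities(Predicate<Entity> predicate);
  }

  // 2) Aggregation data, shown for the flow level only:
  // (a) aggregate for one flow, (b) search flows by a predicate on the aggregate.
  interface AggregationReader {
    long getFlowCpuMillis(String flow);
    List<String> findFlows(Predicate<Long> totalCpuPredicate);
  }

  // In-memory stand-in for a backing storage implementation.
  static final class InMemoryReader implements RawDataReader, AggregationReader {
    private final Map<String, Entity> byId = new LinkedHashMap<>();

    void add(Entity entity) {
      byId.put(entity.id, entity);
    }

    @Override
    public Entity getEntity(String entityId) {
      return byId.get(entityId);
    }

    @Override
    public List<Entity> getEntities(Predicate<Entity> predicate) {
      List<Entity> out = new ArrayList<>();
      for (Entity entity : byId.values()) {
        if (predicate.test(entity)) {
          out.add(entity);
        }
      }
      return out;
    }

    // Sums cpuMillis per flow; a real store would keep this pre-aggregated.
    private Map<String, Long> totalsByFlow() {
      Map<String, Long> totals = new TreeMap<>();
      for (Entity entity : byId.values()) {
        Long soFar = totals.get(entity.flow);
        totals.put(entity.flow, (soFar == null ? 0L : soFar) + entity.cpuMillis);
      }
      return totals;
    }

    @Override
    public long getFlowCpuMillis(String flow) {
      Long total = totalsByFlow().get(flow);
      return total == null ? 0L : total;
    }

    @Override
    public List<String> findFlows(Predicate<Long> totalCpuPredicate) {
      List<String> out = new ArrayList<>();
      for (Map.Entry<String, Long> total : totalsByFlow().entrySet()) {
        if (totalCpuPredicate.test(total.getValue())) {
          out.add(total.getKey());
        }
      }
      return out;
    }
  }
}
```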




[jira] [Commented] (YARN-3051) [Storage abstraction] Create backing storage read interface for ATS readers

2015-06-10 Thread Varun Saxena (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14580858#comment-14580858
 ] 

Varun Saxena commented on YARN-3051:


[~zjshen], thanks for your inputs. Here is a brief summary of the APIs I have 
decided on as of now.

# APIs for querying an individual entity/flow/flow run/user and APIs for 
querying a set of entities/flow runs/flows/users. APIs that return a set of 
flows/users will contain aggregated data. The reason for separate endpoints for 
entities, flows, users, etc. is the different tables in the HBase/Phoenix 
schema.
# Most of the APIs will be variations of getting either a single entity or a 
set of entities, so I will primarily talk about a single entity and a set of 
entities in the subsequent points.
# For getting a set of entities, there will be 3 kinds of filters: filtering 
on info, filtering on configs and filtering on metrics. Filtering on info and 
configs will be based on equality, for instance, fetch entities which have a 
config name matching a specific config value. Metrics filtering, though, will 
be based on relational operators. For instance, a user can query entities 
which have a specific metric >= a certain value.
# In addition, certain predicates such as limit, windowStart, windowEnd, etc., 
which used to exist in ATSv1, exist even now. Some predicates such as fromId 
and fromTs may not make sense in ATSv2, but I have included them for now so 
that we can discuss them.
# Additional predicates such as metricswindowStart and end have been specified 
to fetch metrics data for a specific time span. I included these because they 
can aid in plotting graphs on the UI for a specific metric of some entity.
# Only entity id, type, created time and modified time will be returned if 
fields are not specified in the REST URL. This will be the default view of an 
entity.
# Moreover, you can also specify which configurations and metrics to return.
# Every query param will be received as a String, even a timestamp. Now, from 
the backing storage implementation's viewpoint, would it make more sense to 
pass these query params as strings or to do datatype conversion first?
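
The two filter kinds from point 3 can be sketched roughly as below. This is 
only an illustration under assumptions: {{EntityFilterSketch}}, {{Op}}, 
{{Entity}} and the method names are hypothetical and are not taken from any of 
the attached patches, and a real entity carries far more than two maps.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: none of these names come from the YARN-3051 patches.
final class EntityFilterSketch {

  /** Relational operators assumed for metric filters. */
  enum Op { LT, LE, EQ, GE, GT }

  /** Minimal stand-in for a timeline entity. */
  static final class Entity {
    final String id;
    final Map<String, String> configs = new HashMap<>();
    final Map<String, Long> metrics = new HashMap<>();

    Entity(String id) {
      this.id = id;
    }
  }

  /** Equality filter: the config must be present with exactly this value. */
  static boolean configEquals(Entity e, String name, String value) {
    return value.equals(e.configs.get(name));
  }

  /** Relational filter: compares a metric value against a threshold. */
  static boolean metricMatches(Entity e, String name, Op op, long threshold) {
    Long v = e.metrics.get(name);
    if (v == null) {
      return false; // an entity without the metric never matches
    }
    switch (op) {
      case LT: return v < threshold;
      case LE: return v <= threshold;
      case EQ: return v == threshold;
      case GE: return v >= threshold;
      default: return v > threshold; // GT
    }
  }

  /** Returns ids of entities passing both filters, capped at limit. */
  static List<String> select(List<Entity> all, String cfg, String cfgValue,
      String metric, Op op, long threshold, int limit) {
    List<String> ids = new ArrayList<>();
    for (Entity e : all) {
      if (ids.size() >= limit) {
        break;
      }
      if (configEquals(e, cfg, cfgValue)
          && metricMatches(e, metric, op, threshold)) {
        ids.add(e.id);
      }
    }
    return ids;
  }
}
```

Here the limit predicate is applied after filtering; whether a backing store 
can push either filter down into its own query is a separate question.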

The concerns Li Lu raised about the parameter list becoming too long are quite 
valid, as most of the parameters will be null. We can club related parameters 
into separate classes to reduce the count or, as he said, have different 
methods for frequently occurring use cases. Thoughts?

Comments are welcome so that this JIRA can speed up, probably after Hadoop 
Summit :)



[jira] [Commented] (YARN-3051) [Storage abstraction] Create backing storage read interface for ATS readers

2015-06-10 Thread Varun Saxena (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14580910#comment-14580910
 ] 

Varun Saxena commented on YARN-3051:


As of now, there are very similar APIs for 
getEntity/getFlowEntity/getUserEntity, etc. Would it be fine to combine these 
APIs and pass something like a query type (ENTITY/USER/FLOW, etc.) that the 
storage implementation can then use to decide which type of query it is?
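
A minimal sketch of what a combined call keyed on a query type might look 
like. Everything here is an assumption made for illustration: the enum 
constants beyond ENTITY/USER/FLOW, the table names and the {{get}} method are 
made up and do not reflect the actual schema or patch.

```java
// Hypothetical sketch: the enum and the table names are illustrative only.
final class QueryTypeSketch {

  /** The kind of object a read call targets. */
  enum QueryType { ENTITY, FLOW, FLOW_RUN, USER }

  /** A storage implementation could branch on the type to pick a table. */
  static String tableFor(QueryType type) {
    switch (type) {
      case ENTITY:   return "entity_table";
      case FLOW:     return "flow_table";
      case FLOW_RUN: return "flow_run_table";
      case USER:     return "user_table";
      default:       throw new IllegalArgumentException("unknown: " + type);
    }
  }

  /** One combined method in place of getEntity/getFlowEntity/getUserEntity. */
  static String get(QueryType type, String clusterId, String id) {
    // A real reader would issue a storage lookup; this only shows the dispatch.
    return tableFor(type) + "/" + clusterId + "/" + id;
  }
}
```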

> [Storage abstraction] Create backing storage read interface for ATS readers
> ---
>
> Key: YARN-3051
> URL: https://issues.apache.org/jira/browse/YARN-3051
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Affects Versions: YARN-2928
>Reporter: Sangjin Lee
>Assignee: Varun Saxena
> Attachments: YARN-3051-YARN-2928.003.patch, 
> YARN-3051-YARN-2928.03.patch, YARN-3051-YARN-2928.04.patch, 
> YARN-3051.wip.02.YARN-2928.patch, YARN-3051.wip.patch, YARN-3051_temp.patch
>
>
> Per design in YARN-2928, create backing storage read interface that can be 
> implemented by multiple backing storage implementations.





[jira] [Commented] (YARN-3051) [Storage abstraction] Create backing storage read interface for ATS readers

2015-06-10 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14580961#comment-14580961
 ] 

Hadoop QA commented on YARN-3051:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | pre-patch |  17m 26s | Findbugs (version ) appears to be broken on YARN-2928. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any @author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to include 3 new or modified test files. |
| {color:green}+1{color} | javac |   7m 56s | There were no new javac warning messages. |
| {color:red}-1{color} | javadoc |  10m 12s | The applied patch generated 11 additional warning messages. |
| {color:green}+1{color} | release audit |   0m 24s | The applied patch does not increase the total number of release audit warnings. |
| {color:red}-1{color} | checkstyle |   1m 22s | The applied patch generated 25 new checkstyle issues (total was 243, now 267). |
| {color:green}+1{color} | shellcheck |   0m  6s | There were no new shellcheck (v0.3.3) issues. |
| {color:green}+1{color} | whitespace |   0m  1s | The patch has no lines that end in whitespace. |
| {color:green}+1{color} | install |   1m 40s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 40s | The patch built with eclipse:eclipse. |
| {color:red}-1{color} | findbugs |   4m  2s | The patch appears to introduce 5 new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | yarn tests |   0m 22s | Tests passed in hadoop-yarn-api. |
| {color:green}+1{color} | yarn tests |   1m 59s | Tests passed in hadoop-yarn-common. |
| {color:green}+1{color} | yarn tests |   1m 27s | Tests passed in hadoop-yarn-server-timelineservice. |
| | |  48m  2s | |
\\
\\
|| Reason || Tests ||
| FindBugs | module:hadoop-yarn-server-timelineservice |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | http://issues.apache.org/jira/secure/attachment/12738884/YARN-3051-YARN-2928.04.patch |
| Optional Tests | shellcheck javadoc javac unit findbugs checkstyle |
| git revision | YARN-2928 / 0a3c147 |
| javadoc | https://builds.apache.org/job/PreCommit-YARN-Build/8234/artifact/patchprocess/diffJavadocWarnings.txt |
| checkstyle | https://builds.apache.org/job/PreCommit-YARN-Build/8234/artifact/patchprocess/diffcheckstylehadoop-yarn-api.txt |
| Findbugs warnings | https://builds.apache.org/job/PreCommit-YARN-Build/8234/artifact/patchprocess/newPatchFindbugsWarningshadoop-yarn-server-timelineservice.html |
| hadoop-yarn-api test log | https://builds.apache.org/job/PreCommit-YARN-Build/8234/artifact/patchprocess/testrun_hadoop-yarn-api.txt |
| hadoop-yarn-common test log | https://builds.apache.org/job/PreCommit-YARN-Build/8234/artifact/patchprocess/testrun_hadoop-yarn-common.txt |
| hadoop-yarn-server-timelineservice test log | https://builds.apache.org/job/PreCommit-YARN-Build/8234/artifact/patchprocess/testrun_hadoop-yarn-server-timelineservice.txt |
| Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/8234/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf903.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | https://builds.apache.org/job/PreCommit-YARN-Build/8234/console |


This message was automatically generated.



[jira] [Commented] (YARN-3051) [Storage abstraction] Create backing storage read interface for ATS readers

2015-06-12 Thread Li Lu (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14583839#comment-14583839
 ] 

Li Lu commented on YARN-3051:
-

Hi [~varun_saxena], thanks for the update! Some of my quick thoughts for 
discussion...
# I just realized in this JIRA we are creating "backing storage read interface 
for ATS readers", but not the user facing ATS reader APIs. I believe these two 
topics are different: in this JIRA we're "wiring up" the storage systems, but 
in ATS reader APIs, we need to deal with user requirements. This said, I think 
the main design goal here is to provide a small set of generic interfaces so 
that we can easily connect them to our writers. We may want to have some brief 
ideas of the potential user facing features (as [~zjshen] mentioned in a 
previous comment), but I'm not sure if we need to implement them before we make 
a concrete design for the storage read interface. 
# If my understanding in point 1 is right, then perhaps we do not need to worry 
too much about the huge list of nulls. Of course, at the code level we may want 
to do some cosmetic fixes, but since those interfaces are not user facing, I 
think making them more general may be more important?
# I still think when doing the v2 interface design, it is fine, if not even 
beneficial, to start from scratch rather than thinking about the existing v1 
design. If we're not implementing some v1 features as first-class in v2 storage 
implementations, maybe we can simply leave them out from the interfaces to 
storage level? (I assume we'll have an intermediate layer to do the wire up 
between our user facing reader APIs and the storage interfaces. )
# bq. Now from backing storage implementation viewpoint, would it make more 
sense to let these query params be passed as strings or do datatype conversion ?
I've got no strong preference on this. Leaving them as a generic type (like 
string) gives the storage layer more freedom to interpret the data, but the 
readers need to ensure they understand the types by themselves. 

BTW, could you please briefly skim through the list of Jenkins warnings and see 
if they're critical? Thanks! 



[jira] [Commented] (YARN-3051) [Storage abstraction] Create backing storage read interface for ATS readers

2015-06-12 Thread Li Lu (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14583878#comment-14583878
 ] 

Li Lu commented on YARN-3051:
-

I verified locally that the pre-patch findbugs warnings no longer exist. 



[jira] [Commented] (YARN-3051) [Storage abstraction] Create backing storage read interface for ATS readers

2015-06-12 Thread Zhijie Shen (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14584148#comment-14584148
 ] 

Zhijie Shen commented on YARN-3051:
---

bq. APIs for querying an individual entity/flow/flow run/user and APIs for 
querying a set of entities/flow runs/flows/users. APIs that return a set of 
flows/users will contain aggregated data. The reason for separate endpoints for 
entities, flows, users, etc. is the different tables in the HBase/Phoenix 
schema.

I don't think we store first-class-citizen entities in a different way or in 
different tables (Li/Vrushali, correct me if I'm wrong). When fetching an 
entity, it doesn't matter whether it is a customized entity or a predefined 
entity such as ApplicationEntity.

In fact, we have two levels of interfaces. One is the storage interface and the 
other is the user-oriented interface. I think it's a good idea to let the 
user-oriented interface have more specific/advanced APIs to handle the 
special entity objects, while the storage interface has fewer, more uniform 
APIs that reuse the common logic as much as possible. Thoughts?

bq. Every query param will be received as a String, even a timestamp. Now, from 
the backing storage implementation's viewpoint, would it make more sense to 
pass these query params as strings or to do datatype conversion first?

I think we need to take a generic type as the param. If a value is transformed 
to a string, it is likely to be difficult to recover the original type 
information. For example, when we see the string "true", how do we know whether 
it used to be the string "true" or the boolean true? Likewise, is "1234567" a 
number, or a string that represents a vehicle license?
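
The ambiguity can be shown with a few lines of plain Java. This is only an 
illustration of the argument; {{TypedParamSketch}} and its methods are 
hypothetical names and are not part of any reader API.

```java
// Illustrative only: shows why stringified parameters lose type information.
final class TypedParamSketch {

  /** Renders a value together with its runtime type. */
  static String describe(Object value) {
    return value.getClass().getSimpleName() + ":" + value;
  }

  /** True when two values become indistinguishable once stringified. */
  static boolean collideAsStrings(Object a, Object b) {
    return String.valueOf(a).equals(String.valueOf(b));
  }

  public static void main(String[] args) {
    System.out.println(describe(Boolean.TRUE));                 // Boolean:true
    System.out.println(describe("true"));                       // String:true
    System.out.println(collideAsStrings(Boolean.TRUE, "true")); // true
    System.out.println(collideAsStrings(1234567L, "1234567"));  // true
  }
}
```

Passed as {{Object}}, the boolean and the string stay distinguishable; once 
both are strings, the storage layer has to guess.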




[jira] [Commented] (YARN-3051) [Storage abstraction] Create backing storage read interface for ATS readers

2015-06-12 Thread Li Lu (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14584197#comment-14584197
 ] 

Li Lu commented on YARN-3051:
-

bq. APIs for querying an individual entity/flow/flow run/user and APIs for 
querying a set of entities/flow runs/flows/users. APIs that return a set of 
flows/users will contain aggregated data.
bq. I don't think we store first-class-citizen entities in a different way or 
in different tables (Li/Vrushali, correct me if I'm wrong). When fetching an 
entity, it doesn't matter whether it is a customized entity or a predefined 
entity such as ApplicationEntity.

If we're discussing the storage read interface, why would it be harmful to 
explicitly separate the interfaces for raw data and aggregated data, as 
[~zjshen] proposed before? We can work on the raw data interface first, while 
the aggregations are still being designed. 

bq. If it's transformed to a string, it is likely to be difficult to recover 
the original type information. 

I agree. A follow-up concern is: who maintains, or interprets, the type 
information? I assume we need the readers themselves to keep track of this? 



[jira] [Commented] (YARN-3051) [Storage abstraction] Create backing storage read interface for ATS readers

2015-06-14 Thread Varun Saxena (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14585178#comment-14585178
 ] 

Varun Saxena commented on YARN-3051:


bq. I think it's a good idea to let the user-oriented interface have more 
specific/advanced APIs to handle the special entity objects, while the storage 
interface has fewer, more uniform APIs that reuse the common logic as much as 
possible. Thoughts?
After adding a lot of similar APIs, I am of the same view. A lot more detail 
can then be added in the javadoc.
This would reduce code bloat. 



[jira] [Commented] (YARN-3051) [Storage abstraction] Create backing storage read interface for ATS readers

2015-06-16 Thread Sangjin Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14589036#comment-14589036
 ] 

Sangjin Lee commented on YARN-3051:
---

Sorry it has taken me a while to chime in on this JIRA. I've just gone over the 
recent comments, and also skimmed through the latest patch. BTW, the latest 
patch doesn't seem to apply cleanly (conflicts on {{yarn.cmd}}). 
[~varun_saxena], could you kindly check the latest patch to see if it needs to 
be updated?

I agree with most of the ideas put forward by folks in the comments. I agree 
with [~zjshen] that it'd be desirable to have more specific APIs for the 
user-oriented side of the code and have a bit more generic (for lack of a 
better term) APIs on the side of the storage interaction (namely the 
{{TimelineReader}} interface in its current form).

The goals of the {{TimelineReader}} API are twofold: first, it should be 
generic/flexible enough to accommodate a wide range of queries, 
including the current queries as well as possible future queries, and second, 
it should help the storage implementations translate them into efficient 
queries onto the storage itself.

One idea that may help in this regard is to create further coarse-grained 
concepts and use them in the {{TimelineReader}} API. It's already doing that to 
some extent, and we should push that some more. For instance, it might be 
helpful to create *{{Context}}*. The unique context for most of the queries 
would involve the cluster id and the app id. So we can make cluster id and the 
app id part of the {{Context}} object and have {{TimelineReader}} deal with 
{{Context}} instead of enumerating things like cluster id explicitly in its 
methods.

Similarly, we might want to define *predicates and/or filters*, and use them in 
the {{TimelineReader}} API. In essence, one way to look at it is that a query 
onto the storage is really (context) + (predicate/filters) + (contents to 
retrieve). Then we could consolidate arguments into these coarse-grained things.

Also, for the context, I don't think we need to require things like flow id or 
flow run id. The storage should be able to define the context and locate 
entities only with cluster id and the app id.
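
The (context) + (predicate/filters) + (contents to retrieve) decomposition 
could be sketched as follows. All names here ({{ReaderQuerySketch}}, 
{{Context}}, {{Predicates}}, {{Query}}, {{Field}}) are hypothetical and chosen 
for illustration; the existing Context class mentioned elsewhere in this 
thread is not modeled.

```java
import java.util.EnumSet;

// Hypothetical sketch: class, field and method names are assumptions.
final class ReaderQuerySketch {

  /** Parts of an entity a caller may ask for. */
  enum Field { INFO, CONFIGS, METRICS, EVENTS }

  /** Where to look: cluster id and app id are enough to locate entities. */
  static final class Context {
    final String clusterId;
    final String appId;

    Context(String clusterId, String appId) {
      this.clusterId = clusterId;
      this.appId = appId;
    }
  }

  /** Which entities qualify; a null member means "no constraint". */
  static final class Predicates {
    Long limit;
    Long createdTimeBegin;
    Long createdTimeEnd;
  }

  /** A query is (context) + (predicates/filters) + (contents to retrieve). */
  static final class Query {
    final Context context;
    final Predicates predicates;
    final EnumSet<Field> fieldsToRetrieve;

    Query(Context context, Predicates predicates, EnumSet<Field> fields) {
      this.context = context;
      this.predicates = predicates;
      this.fieldsToRetrieve = fields;
    }
  }

  /** A reader method then takes one argument instead of a dozen. */
  static String describe(Query q) {
    return q.context.clusterId + "/" + q.context.appId
        + " limit=" + q.predicates.limit
        + " fields=" + q.fieldsToRetrieve;
  }
}
```

Grouping the arguments this way also removes the long run of nulls at call 
sites: unset predicates simply stay null inside one object.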



[jira] [Commented] (YARN-3051) [Storage abstraction] Create backing storage read interface for ATS readers

2015-06-17 Thread Zhijie Shen (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14590133#comment-14590133
 ] 

Zhijie Shen commented on YARN-3051:
---

[~sjlee0], thanks for chiming in. Varun, Li and I recently had an offline 
discussion. In general, we agreed to focus on the storage-oriented interface 
(raw data query) together with an FS implementation of it in this JIRA, and to 
spin off the changes for the user-oriented interface, web front wire-up, and 
single reader daemon setup and deal with them separately. The rationale is 
to roll out the reader interface fast, so that we can work on the HBase/Phoenix 
implementation and the web front wire-up against a commonly agreed interface in 
parallel. What do you think about the plan?

bq.  It's already doing that to some extent, and we should push that some more. 
For instance, it might be helpful to create Context. 

Context is useful. Instead of creating a new one, maybe we can reuse the 
existing Context, which hosts more content than the reader needs. So we just 
need to let the reader put/get the required information to/from it.

bq. In essence, one way to look at it is that a query onto the storage is 
really (context) + (predicate/filters) + (contents to retrieve). Then we could 
consolidate arguments into these coarse-grained things.

+1 LGTM, but I think it's for the query of searching a set of qualified 
entities, right? For fetching a single entity, the query may look like 
(context) + (entity identifier) + (contents to retrieve).

Another issue I want to raise is that after our performance evaluation, we 
agreed on using HBase for raw data and Phoenix for aggregated data. It implies 
that we need to use HBase to implement the APIs for the raw entities, and use 
Phoenix to implement the APIs for the aggregated data.



[jira] [Commented] (YARN-3051) [Storage abstraction] Create backing storage read interface for ATS readers

2015-06-18 Thread Sangjin Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14592261#comment-14592261
 ] 

Sangjin Lee commented on YARN-3051:
---

{quote}
Varun, Li and I recently had an offline discussion. In general, we agreed to 
focus on the storage-oriented interface (raw data query) together with an FS 
implementation of it in this JIRA, and to spin off the changes for the 
user-oriented interface, web front wire-up, and single reader daemon setup and 
deal with them separately. The rationale is to roll out the reader interface 
fast, so that we can work on the HBase/Phoenix implementation and the web front 
wire-up against a commonly agreed interface in parallel. What do you think 
about the plan?
{quote}
Agreed with the approach. I would go so far as focusing on the raw data reader 
part first and get that done and get to the aggregated reader later. Thoughts?

{quote}
Context is useful. Instead of creating a new one, maybe we can reuse the 
existing Context, which hosts more content than the reader needs. So we just 
need to let the reader put/get the required information to/from it.
{quote}
It should be fine, as long as it is clear we don't need to fill in all the info 
for the read path.

{quote}
+1 LGTM, but I think it's for the query of searching a set of qualified 
entities, right? For fetching a single entity, the query may look like 
(context) + (entity identifier) + (contents to retrieve).
{quote}
Yes, I agree. One can still think of the entity id as a special form of a 
"predicate". I'm not married to exactly one API; I just see the need for a more 
coarse-grained approach.
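
To illustrate the "entity id as a special predicate" view, here is a rough 
sketch in which a single-entity fetch is just a set query with an id predicate 
and a limit of 1. The in-memory list stands in for the storage, and all names 
({{IdPredicateSketch}}, {{byId}}, and so on) are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;
import java.util.function.Predicate;

// Hypothetical sketch: an in-memory list stands in for the backing storage.
final class IdPredicateSketch {

  /** Minimal stand-in for a timeline entity. */
  static final class Entity {
    final String type;
    final String id;

    Entity(String type, String id) {
      this.type = type;
      this.id = id;
    }
  }

  /** The entity identifier expressed as an ordinary predicate. */
  static Predicate<Entity> byId(String type, String id) {
    return e -> e.type.equals(type) && e.id.equals(id);
  }

  /** Generic set query: returns at most limit entities matching p. */
  static List<Entity> getEntities(List<Entity> store, Predicate<Entity> p,
      int limit) {
    List<Entity> hits = new ArrayList<>();
    for (Entity e : store) {
      if (hits.size() >= limit) {
        break;
      }
      if (p.test(e)) {
        hits.add(e);
      }
    }
    return hits;
  }

  /** Single-entity fetch built on the set query. */
  static Optional<Entity> getEntity(List<Entity> store, String type,
      String id) {
    List<Entity> hits = getEntities(store, byId(type, id), 1);
    return hits.isEmpty()
        ? Optional.<Entity>empty()
        : Optional.of(hits.get(0));
  }
}
```

A storage implementation would likely still special-case the id lookup for 
efficiency, which is one argument for keeping two methods on the interface.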

{quote}
Another issue I want to raise is that after our performance evaluation, we 
agreed on using HBase for raw data and Phoenix for aggregated data. It implies 
that we need to use HBase to implement the APIs for the raw entities, and use 
Phoenix to implement the APIs for the aggregated data.
{quote}
We discussed this offline. There are a couple of different approaches for 
this. We could have separate reader APIs for raw data and (time-based) 
aggregated data. Or we could hide the separation behind a facade reader 
implementation that dispatches calls for raw data to an HBase reader impl and 
those for aggregated data to a Phoenix impl. Either way, it should be pretty 
straightforward.
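
The facade option might look roughly like the sketch below. It is only an 
outline under assumptions: {{RawReader}}, {{AggregateReader}} and 
{{FacadeReader}} are hypothetical names, the methods return strings for 
brevity, and no real HBase or Phoenix client calls are shown.

```java
// Hypothetical sketch: the interfaces are stand-ins, not real HBase/Phoenix
// APIs.
final class FacadeReaderSketch {

  /** Reads raw entities (to be backed by HBase in the plan above). */
  interface RawReader {
    String getEntity(String clusterId, String appId, String entityId);
  }

  /** Reads time-based aggregates (to be backed by Phoenix in the plan). */
  interface AggregateReader {
    String getFlowAggregate(String clusterId, String flowId);
  }

  /** Single entry point that hides which backend serves which call. */
  static final class FacadeReader implements RawReader, AggregateReader {
    private final RawReader raw;
    private final AggregateReader aggregated;

    FacadeReader(RawReader raw, AggregateReader aggregated) {
      this.raw = raw;
      this.aggregated = aggregated;
    }

    @Override
    public String getEntity(String clusterId, String appId, String entityId) {
      return raw.getEntity(clusterId, appId, entityId); // raw data path
    }

    @Override
    public String getFlowAggregate(String clusterId, String flowId) {
      return aggregated.getFlowAggregate(clusterId, flowId); // aggregate path
    }
  }
}
```

Callers would hold only the facade, so swapping either backend would not 
change the code that issues queries.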




[jira] [Commented] (YARN-3051) [Storage abstraction] Create backing storage read interface for ATS readers

2015-06-18 Thread Zhijie Shen (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14592536#comment-14592536
 ] 

Zhijie Shen commented on YARN-3051:
---

bq. Agreed with the approach. I would go so far as focusing on the raw data 
reader part first and get that done and get to the aggregated reader later. 
Thoughts?

Exactly. Based on the discussion so far, I've drafted a patch of the reader 
APIs and attached it here. It contains just two methods: one to fetch a single 
entity and the other to search for a set of entities with given predicates. For 
the predicates, I started with the common stuff that we have in the timeline 
service v2 data model.

Please take a look. Hopefully folks are generally satisfied with the APIs. 
Then we can start from here and have more iterations to enrich the query 
semantics and support backward compatibility. Thoughts?

> [Storage abstraction] Create backing storage read interface for ATS readers
> ---
>
> Key: YARN-3051
> URL: https://issues.apache.org/jira/browse/YARN-3051
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Affects Versions: YARN-2928
>Reporter: Sangjin Lee
>Assignee: Varun Saxena
> Attachments: YARN-3051-YARN-2928.003.patch, 
> YARN-3051-YARN-2928.03.patch, YARN-3051-YARN-2928.04.patch, 
> YARN-3051.Reader_API.patch, YARN-3051.wip.02.YARN-2928.patch, 
> YARN-3051.wip.patch, YARN-3051_temp.patch
>
>
> Per design in YARN-2928, create backing storage read interface that can be 
> implemented by multiple backing storage implementations.





[jira] [Commented] (YARN-3051) [Storage abstraction] Create backing storage read interface for ATS readers

2015-06-18 Thread Varun Saxena (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14593029#comment-14593029
 ] 

Varun Saxena commented on YARN-3051:


bq. Or we could hide the separation behind a facade reader implementation that 
dispatches calls for raw data to an HBase reader impl and those for aggregated 
data to a Phoenix impl.
+1 for this.



[jira] [Commented] (YARN-3051) [Storage abstraction] Create backing storage read interface for ATS readers

2015-06-18 Thread Varun Saxena (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14593051#comment-14593051
 ] 

Varun Saxena commented on YARN-3051:


[~zjshen], I compared the reader API patch you submitted with the patches 
already submitted. Some comments:

{code}
  TimelineEntity getEntity(
 String clusterId, String appId, String entityType, String entityId,
 EnumSet fieldsToRetrieve) throws IOException;

  Set getEntities(
  String clusterId, String appId, Set entityTypes, Long limit,
  Long createdTimeBegin, Long createdTimeEnd, Long modifiedTimeBegin,
  Long modifiedTimeEnd, Set relatesTo,
  Set isRelatedTo, Set info,
  Set configs, Set events, Set metrics,
  EnumSet fieldsToRetrieve) throws IOException;
{code}

* We had decided that the user may not need to retrieve all the configs and 
metrics, so should we have a parameter for that, i.e. a list of the metrics and 
confs the user wants to retrieve, for both APIs? I had included this in the 
patch I made. Do we need it?
* Shouldn't we have metric filters to support queries such as "fetch entities 
which have a metric > a certain value"? The patch I made included support for 
relational operators.
* What is a query use case for having relatesTo and isRelatedTo as filters?
* We do not need flowId and flowRunId to get an entity, but they can still be 
optional arguments so that we avoid a peek into the table which gets them based 
on cluster id and app id. Thoughts?
* Will we fetch entities across entityTypes? We also have events as filters 
here, and they may not match across entity types. Thoughts?
* As per our previous discussion, I had also included metrics time windows in 
the APIs. This may aid in plotting graphs for long-running apps. Thoughts?

> [Storage abstraction] Create backing storage read interface for ATS readers
> ---
>
> Key: YARN-3051
> URL: https://issues.apache.org/jira/browse/YARN-3051
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Affects Versions: YARN-2928
>Reporter: Sangjin Lee
>Assignee: Varun Saxena
> Attachments: YARN-3051-YARN-2928.003.patch, 
> YARN-3051-YARN-2928.03.patch, YARN-3051-YARN-2928.04.patch, 
> YARN-3051.Reader_API.patch, YARN-3051.wip.02.YARN-2928.patch, 
> YARN-3051.wip.patch, YARN-3051_temp.patch
>
>
> Per design in YARN-2928, create backing storage read interface that can be 
> implemented by multiple backing storage implementations.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3051) [Storage abstraction] Create backing storage read interface for ATS readers

2015-06-19 Thread Zhijie Shen (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14593627#comment-14593627
 ] 

Zhijie Shen commented on YARN-3051:
---

First of all, I'd like to say this is not the finalized reader API, but the 
one we are okay to start with: two types of query, and a set of essential 
parameters that focus on tuning which entities to return. We can definitely 
iterate over the APIs to add more parameters to trim the results and to 
control sub-entity information.

bq. We had decided that users may not need to retrieve all the configs and 
metrics, and hence we should have a parameter to indicate that: a list of the 
metrics and confs the user wants to retrieve, for both APIs. I had included 
this in the patch I had made. Do we need it?

Yeah, we could have these parameters, but I'm wondering about an efficient way 
to retrieve part of the configs/metrics in a huge set. For example, say I'm 
interested in all the mapred configs of my job. What should I do? Enumerating 
all the mapred configs I want to retrieve in the query parameter would be a 
nightmare. My immediate thought is a regex, but I don't want to include this 
parameter in the initial version until we're clear about how to specify it.

bq. Shouldn't we have metrics filters to support queries like fetch entities 
which have a metric > a certain value. In the patch I had included support for 
relational operators.

We should. See my TODO comment. The problem again is that it's not a simple 
predicate. How do we want to abstract and support it? You give the example ">", 
but we need to take care of "<", "=", "!=", "like" and so on.
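
To make the abstraction question concrete, here is a minimal sketch of a metric predicate that carries its operator explicitly. `CompareOp` and `MetricFilter` are hypothetical names for illustration and are not taken from any attached patch; metric values are assumed to be longs, and "like" is left out because it only makes sense for string values.

```java
import java.util.Map;

/** Hypothetical relational operators; "like" is left out of this sketch. */
enum CompareOp {
  LESS_THAN, LESS_OR_EQUAL, EQUAL, NOT_EQUAL, GREATER_OR_EQUAL, GREATER_THAN;

  /** Evaluates "actual <op> expected". */
  boolean test(long actual, long expected) {
    int c = Long.compare(actual, expected);
    switch (this) {
      case LESS_THAN:        return c < 0;
      case LESS_OR_EQUAL:    return c <= 0;
      case EQUAL:            return c == 0;
      case NOT_EQUAL:        return c != 0;
      case GREATER_OR_EQUAL: return c >= 0;
      default:               return c > 0; // GREATER_THAN
    }
  }
}

/** Hypothetical filter: metric id, operator, and the value to compare against. */
final class MetricFilter {
  private final String metricId;
  private final CompareOp op;
  private final long value;

  MetricFilter(String metricId, CompareOp op, long value) {
    this.metricId = metricId;
    this.op = op;
    this.value = value;
  }

  /** An entity that does not report the metric never matches. */
  boolean matches(Map<String, Long> metricValues) {
    Long actual = metricValues.get(metricId);
    return actual != null && op.test(actual, value);
  }
}
```

Passing a list of such filters, rather than a map keyed by metric id, would allow two conditions on the same metric.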

bq. We do not need flowId and flowRunId to get an entity. But they can still be 
optional arguments so that we avoid a peek into the table which gets them based 
on cluster and appId. Thoughts?

Yeah, it makes sense. Imagine we have the web UI, and the user is directed from 
the flow page to the app page and moves on; he's going to carry the flow 
information. If the user can provide flowId/flowRunId, we can locate the entity 
more efficiently. We can have the two params and make them optional. Also, it 
seems that I've missed userId too. It's the first piece of the entity key. 
IMHO, we should have it and make it mandatory to avoid scanning through the 
whole key space. And it should be reasonable that we take the requester as the 
user and only search within his entity space, not others'.
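
As a rough illustration of the mandatory/optional split described above, the context arguments could be bundled and validated in one place. `EntityQueryContext` and `needsFlowLookup()` are hypothetical names, not part of any attached patch; the sketch only assumes that userId, clusterId and appId are mandatory while flowId and flowRunId may be null.

```java
import java.util.Objects;

/** Hypothetical holder for the context part of a reader query. */
final class EntityQueryContext {
  final String userId;    // mandatory: first piece of the entity key
  final String clusterId; // mandatory
  final String appId;     // mandatory
  final String flowId;    // optional, may be null
  final String flowRunId; // optional, may be null

  EntityQueryContext(String userId, String clusterId, String appId,
      String flowId, String flowRunId) {
    this.userId = Objects.requireNonNull(userId, "userId is mandatory");
    this.clusterId = Objects.requireNonNull(clusterId, "clusterId is mandatory");
    this.appId = Objects.requireNonNull(appId, "appId is mandatory");
    this.flowId = flowId;
    this.flowRunId = flowRunId;
  }

  /**
   * True when the store would first have to look up the flow information
   * from cluster and app id before it can locate the entity.
   */
  boolean needsFlowLookup() {
    return flowId == null || flowRunId == null;
  }
}
```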

bq. Will we fetch entities across entityTypes? We also have events as filters 
here. They may not match across entity types. Thoughts?

Good point, let's go with single entityType first.

bq. As per our previous discussion I had also included metrics time windows in 
the APIs. This may aid in plotting graphs for long-running apps. Thoughts?

This seems to belong to (contents to retrieve), and it is not difficult to 
enforce the window. We can add this to the param list. One question is whether 
we want to specify the window per metric or for all metrics. Personally, I 
prefer to defer it together with fetching particular configs/metrics in a later 
enhancement about (contents to retrieve). What do you think?

I've updated the Reader interface accordingly.

> [Storage abstraction] Create backing storage read interface for ATS readers
> ---
>
> Key: YARN-3051
> URL: https://issues.apache.org/jira/browse/YARN-3051
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Affects Versions: YARN-2928
>Reporter: Sangjin Lee
>Assignee: Varun Saxena
> Attachments: YARN-3051-YARN-2928.003.patch, 
> YARN-3051-YARN-2928.03.patch, YARN-3051-YARN-2928.04.patch, 
> YARN-3051.Reader_API.patch, YARN-3051.Reader_API_1.patch, 
> YARN-3051.wip.02.YARN-2928.patch, YARN-3051.wip.patch, YARN-3051_temp.patch
>
>
> Per design in YARN-2928, create backing storage read interface that can be 
> implemented by multiple backing storage implementations.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3051) [Storage abstraction] Create backing storage read interface for ATS readers

2015-06-19 Thread Varun Saxena (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14593710#comment-14593710
 ] 

Varun Saxena commented on YARN-3051:


Thanks [~zjshen] for your comments.

bq. Yeah, we could have these parameters, but I'm wondering about an efficient 
way to retrieve part of the configs/metrics in a huge set. 
Makes sense. We could use a regex or club different configs into different 
groups and let user query that group. But then the problem will be how do we 
specify those groups. So as you say let's defer it and discuss it at length when 
we take it up.

bq. You give the example ">", but we need to take care of "<", "=", "!=", 
"like" and so on.
Yes, we should support all relational operators. I had implemented that as well 
in the patch. We can defer this, though, if we do not envisage having store 
implementations for this as of now.

bq. Personally, I prefer to defer it together with fetching particular 
configs/metrics in a later enhancement about (contents to retrieve). What do you 
think?
OK, let's defer it.

Overall the proposed store interface in the latest attached file LGTM. I will 
go ahead and implement it over the weekend if no further comments come.

One thing though, along the lines of patch submitted earlier, I can include 
something like {{Map}} for metrics in the interface 
for specifying relational operations. It will support things like metricA>val1 
and metricA<...

> [Storage abstraction] Create backing storage read interface for ATS readers
> ---
>
> Key: YARN-3051
> URL: https://issues.apache.org/jira/browse/YARN-3051
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Affects Versions: YARN-2928
>Reporter: Sangjin Lee
>Assignee: Varun Saxena
> Attachments: YARN-3051-YARN-2928.003.patch, 
> YARN-3051-YARN-2928.03.patch, YARN-3051-YARN-2928.04.patch, 
> YARN-3051.Reader_API.patch, YARN-3051.Reader_API_1.patch, 
> YARN-3051.wip.02.YARN-2928.patch, YARN-3051.wip.patch, YARN-3051_temp.patch
>
>
> Per design in YARN-2928, create backing storage read interface that can be 
> implemented by multiple backing storage implementations.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3051) [Storage abstraction] Create backing storage read interface for ATS readers

2015-06-19 Thread Joep Rottinghuis (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14593826#comment-14593826
 ] 

Joep Rottinghuis commented on YARN-3051:


Not all arguments are equally selective. For example, relatesTo (entities) are 
not stored in individual cells that can be used as a push down predicate for 
the HBase tables. We'd have to select all entities that match the other 
criteria, select the relatesTo string, parse it into individual fields and do 
set operations on them.
{code}
  Set getEntities(String userId, String clusterId, String flowId,
      String flowRunId, String appId, String entityType, Long limit,
      Long createdTimeBegin, Long createdTimeEnd, Long modifiedTimeBegin,
      Long modifiedTimeEnd, Set relatesTo,
      Set isRelatedTo, Set info,
      Set configs, Set events, Set metrics,
      EnumSet fieldsToRetrieve) throws IOException;
{code}

If we defer being able to effectively select a subset of columns, what does it 
actually mean to specify a Set ?
Can the value be null to indicate that we don't care what the value is and that 
means that we want the column back in the result?

I think we should separate out predicates (give me all X where Y=Z) versus 
selectors (give me all X...).
It is not clear in the latest patch if fully populated entities will be 
returned.

Wrt.
{quote}
Makes sense. We could use a regex or club different configs into different 
groups and let user query that group. But then the problem will be how do we 
specify those groups. So as you say let's defer it and discuss it at length when 
we take it up.
{quote}
and
{quote}
One thing though, along the lines of patch submitted earlier, I can include 
something like Map for metrics in the interface for 
specifying relational operations. It will support things like metricA>val1 and 
metricA<...
{quote}
... 
(https://hbase.apache.org/apidocs/org/apache/hadoop/hbase/filter/FilterBase.html)
to aggressively reduce what we pull back from HBase. ColumnPrefixFilter for 
example will be a good way to express which config columns to retrieve. A regex 
will be a poor way, as that will result in having to pull back every column 
and then dropping values from the retrieved result.

Similarly, if our rowkeys are prefixed by users then creating an API that 
doesn't include the user (only the cluster) means that we're doing a full table 
scan, albeit with skipfilters that let us skip over users that we're not 
interested in.

In an earlier patch I saw NameValueRelation that was able to perform the 
operations. That again assumes that all values will be retrieved from the 
backing store, and then filtered in the reader before being returned to the 
user. It will be more effective to make sure we can easily map this to 
operations we can push into HBase itself (through a ColumnValueFilter) through 
the available operations 
(https://hbase.apache.org/apidocs/org/apache/hadoop/hbase/filter/CompareFilter.CompareOp.html).
I'm certainly not arguing to have these HBase specific classes exposed in our 
API, but our methods should closely match what can be done, which I don't think 
will be overly restrictive or unreasonable.
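
The prefix-versus-regex point can be illustrated with an in-memory analogy. This sketch does not use the HBase API: a `NavigableMap` stands in for a sorted set of config columns, and `ConfigSelection` is a hypothetical name. The prefix variant can seek to the first candidate and stop at the first miss, while the regex variant has to look at every key.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.NavigableMap;
import java.util.regex.Pattern;

final class ConfigSelection {
  /** Prefix selection: seek to the prefix in sorted order, stop at the first miss. */
  static Map<String, String> byPrefix(
      NavigableMap<String, String> sortedConfigs, String prefix) {
    Map<String, String> out = new LinkedHashMap<>();
    for (Map.Entry<String, String> e
        : sortedConfigs.tailMap(prefix, true).entrySet()) {
      if (!e.getKey().startsWith(prefix)) {
        break; // keys sharing a prefix are contiguous in sorted order
      }
      out.put(e.getKey(), e.getValue());
    }
    return out;
  }

  /** Regex selection: every key has to be examined. */
  static Map<String, String> byRegex(
      Map<String, String> configs, Pattern pattern) {
    Map<String, String> out = new LinkedHashMap<>();
    for (Map.Entry<String, String> e : configs.entrySet()) {
      if (pattern.matcher(e.getKey()).matches()) {
        out.put(e.getKey(), e.getValue());
      }
    }
    return out;
  }
}
```

Both methods return the same entries for a query like "all keys starting with mapreduce."; the difference is only in how much of the key space each one has to touch.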

If we're going to have two types of tables in the backing store:
a) HBase native tables, specifically structured for efficient storage and 
retrieval
and 
b) Phoenix tables (mainly time based aggregates and aggregates over non-primary 
key prefixes), specifically structured for flexible querying
would it make sense to break these two queries into separate families?
Or are we thinking that based on what arguments are passed in, we decide which 
tables to query with which mechanism?


> [Storage abstraction] Create backing storage read interface for ATS readers
> ---
>
> Key: YARN-3051
> URL: https://issues.apache.org/jira/browse/YARN-3051
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Affects Versions: YARN-2928
>Reporter: Sangjin Lee
>Assignee: Varun Saxena
> Attachments: YARN-3051-YARN-2928.003.patch, 
> YARN-3051-YARN-2928.03.patch, YARN-3051-YARN-2928.04.patch, 
> YARN-3051.Reader_API.patch, YARN-3051.Reader_API_1.patch, 
> YARN-3051.wip.02.YARN-2928.patch, YARN-3051.wip.patch, YARN-3051_temp.patch
>
>
> Per design in YARN-2928, create backing storage read interface that can be 
> implemented by multiple backing storage implementations.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3051) [Storage abstraction] Create backing storage read interface for ATS readers

2015-06-19 Thread Zhijie Shen (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14594029#comment-14594029
 ] 

Zhijie Shen commented on YARN-3051:
---

Thanks for chiming in, Joep! Here's my reply:

bq. It is not clear in the latest patch if fully populated entities will be 
returned.

We may not need to worry about it too much. The two APIs are supposed to fetch 
the raw data. We use user Id + cluster Id + app Id + entity type to efficiently 
narrow down the scope to search for entities, and limit the number of entities 
there could be. The following optional parameters will further trim the result 
set.

bq. If we defer being able to effectively select a subset of columns, what does 
it actually mean to specify a Set ?
Can the value be null to indicate that we don't care what the value is and that 
means that we want the column back in the result?

I have updated the javadoc to be more specific about what the parameters are 
supposed to do and whether they're mandatory or optional.

bq. I'm certainly not arguing to have these HBase specific classes exposed in 
our API, but our methods should closely match what can be done, which I don't 
think will be overly restrictive or unreasonable.

I think it's a good suggestion. We should double check that our customized 
filters can be easily mapped to HBase filters.

I've put up a new patch.

> [Storage abstraction] Create backing storage read interface for ATS readers
> ---
>
> Key: YARN-3051
> URL: https://issues.apache.org/jira/browse/YARN-3051
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Affects Versions: YARN-2928
>Reporter: Sangjin Lee
>Assignee: Varun Saxena
> Attachments: YARN-3051-YARN-2928.003.patch, 
> YARN-3051-YARN-2928.03.patch, YARN-3051-YARN-2928.04.patch, 
> YARN-3051.Reader_API.patch, YARN-3051.Reader_API_1.patch, 
> YARN-3051.wip.02.YARN-2928.patch, YARN-3051.wip.patch, YARN-3051_temp.patch
>
>
> Per design in YARN-2928, create backing storage read interface that can be 
> implemented by multiple backing storage implementations.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3051) [Storage abstraction] Create backing storage read interface for ATS readers

2015-06-19 Thread Joep Rottinghuis (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14594067#comment-14594067
 ] 

Joep Rottinghuis commented on YARN-3051:


Thanks [~zjshen], those additional comments in the javadoc explain the bigger 
picture.

A few more questions that would be good to clarify:
{code}
120   * @return a set of {@link TimelineEntity} instances of the given entity type
121   *         in the given context scope which matches the given predicates
122   *         ordered by created time. Each entity will only contain the metadata
123   *         plus the given fields to retrieve
{code}
With "matches" presumably you mean an _and_ relationship, where all must be 
true, not an _or_ where only one of them needs to match, correct?

l122: The "ordered by creation time." refers to how the optional limit is 
applied, not that we actually return an ordered set, right?

{code}
118   * @param fieldsToRetrieve
118   *          the fields to be be returned (optional, by default {@link Field#ALL}
119   *          will be retrieved)
{code}
Probably obvious, but once ALL is specified, all fields will be returned, even 
if only some Fields are specified and others not.
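
The `ALL` behaviour described above can be sketched as a small normalization step. The `Field` enum below is a stand-in whose constants are assumed for illustration and may not match the patch; `FieldResolver` is a hypothetical helper. Following the quoted javadoc, a missing argument is treated as `ALL`; treating an empty set as "metadata only" is an additional assumption.

```java
import java.util.EnumSet;

/** Stand-in for the reader's Field enum; the constants are assumed. */
enum Field { ALL, EVENTS, INFO, METRICS, CONFIGS, RELATES_TO, IS_RELATED_TO }

final class FieldResolver {
  /** Expands the requested fields into the concrete fields to load. */
  static EnumSet<Field> resolve(EnumSet<Field> requested) {
    if (requested == null || requested.contains(Field.ALL)) {
      // ALL wins over anything else in the set: every concrete field is loaded.
      return EnumSet.complementOf(EnumSet.of(Field.ALL));
    }
    // Assumption: an empty set means metadata only, so nothing extra is loaded.
    return EnumSet.copyOf(requested);
  }
}
```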



> [Storage abstraction] Create backing storage read interface for ATS readers
> ---
>
> Key: YARN-3051
> URL: https://issues.apache.org/jira/browse/YARN-3051
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Affects Versions: YARN-2928
>Reporter: Sangjin Lee
>Assignee: Varun Saxena
> Attachments: YARN-3051-YARN-2928.003.patch, 
> YARN-3051-YARN-2928.03.patch, YARN-3051-YARN-2928.04.patch, 
> YARN-3051.Reader_API.patch, YARN-3051.Reader_API_1.patch, 
> YARN-3051.Reader_API_2.patch, YARN-3051.wip.02.YARN-2928.patch, 
> YARN-3051.wip.patch, YARN-3051_temp.patch
>
>
> Per design in YARN-2928, create backing storage read interface that can be 
> implemented by multiple backing storage implementations.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3051) [Storage abstraction] Create backing storage read interface for ATS readers

2015-06-19 Thread Joep Rottinghuis (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14594080#comment-14594080
 ] 

Joep Rottinghuis commented on YARN-3051:


The same question goes for the items in the Set 
relatesTo etc. Do the retrieved entities have to have at least one of the 
related to entities match, or all of them? What if there are more related 
entities, do we want to retrieve only those with the provided related entities 
but no more?

It sounds like nit-picking, but the implementations would differ quite a bit, 
so it is good to express what it is that we want to do.

Rather than locking in on one interpretation, what if we take a page out of the 
HBase manual and we could specify that a filter needs to be applied? We can 
then supply RelatesToFilter, InfoFilter, etc.
Filters can be combined with FilterList where you can specify MUST_PASS_ALL, 
MUST_PASS_ONE (see for example 
https://hbase.apache.org/apidocs/org/apache/hadoop/hbase/filter/FilterList.html).
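
A store-neutral version of that idea might look like the following sketch. All names (`EntityFilter`, `RelatesToFilter`, `EntityFilterList`) are hypothetical; only the `MUST_PASS_ALL` / `MUST_PASS_ONE` operator names are borrowed from HBase's `FilterList`, and nothing here depends on HBase classes. The behaviour for an empty filter list is a choice made for this sketch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Set;

/** Hypothetical filter over the ids an entity relates to. */
interface EntityFilter {
  boolean accept(Set<String> relatedEntityIds);
}

/** Passes when the entity relates to the given id. */
final class RelatesToFilter implements EntityFilter {
  private final String requiredId;

  RelatesToFilter(String requiredId) {
    this.requiredId = requiredId;
  }

  @Override
  public boolean accept(Set<String> relatedEntityIds) {
    return relatedEntityIds.contains(requiredId);
  }
}

/** Combines filters with AND (MUST_PASS_ALL) or OR (MUST_PASS_ONE). */
final class EntityFilterList implements EntityFilter {
  enum Operator { MUST_PASS_ALL, MUST_PASS_ONE }

  private final Operator operator;
  private final List<EntityFilter> filters;

  EntityFilterList(Operator operator, EntityFilter... filters) {
    this.operator = operator;
    this.filters = new ArrayList<>(Arrays.asList(filters));
  }

  @Override
  public boolean accept(Set<String> relatedEntityIds) {
    for (EntityFilter f : filters) {
      boolean passed = f.accept(relatedEntityIds);
      if (operator == Operator.MUST_PASS_ALL && !passed) {
        return false;
      }
      if (operator == Operator.MUST_PASS_ONE && passed) {
        return true;
      }
    }
    // No early exit: under ALL every filter passed, under ONE none did.
    return operator == Operator.MUST_PASS_ALL;
  }
}
```

Because `EntityFilterList` is itself an `EntityFilter`, lists can be nested to express mixed AND/OR conditions; an "exactly these and no more" interpretation would need a separate filter that compares the whole set.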

> [Storage abstraction] Create backing storage read interface for ATS readers
> ---
>
> Key: YARN-3051
> URL: https://issues.apache.org/jira/browse/YARN-3051
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Affects Versions: YARN-2928
>Reporter: Sangjin Lee
>Assignee: Varun Saxena
> Attachments: YARN-3051-YARN-2928.003.patch, 
> YARN-3051-YARN-2928.03.patch, YARN-3051-YARN-2928.04.patch, 
> YARN-3051.Reader_API.patch, YARN-3051.Reader_API_1.patch, 
> YARN-3051.Reader_API_2.patch, YARN-3051.wip.02.YARN-2928.patch, 
> YARN-3051.wip.patch, YARN-3051_temp.patch
>
>
> Per design in YARN-2928, create backing storage read interface that can be 
> implemented by multiple backing storage implementations.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3051) [Storage abstraction] Create backing storage read interface for ATS readers

2015-06-19 Thread Zhijie Shen (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14594098#comment-14594098
 ] 

Zhijie Shen commented on YARN-3051:
---

Yeah, I meant AND logic for these parameters. I think it's good to declare 
that explicitly.

To extend the parameters, we can add more, such as AND and OR. I agree 
it's good to take a look at the HBase filter abstraction and draft ours 
accordingly. I consider that a code improvement to the filter abstraction; 
other than that, hopefully we can agree on using these filters.

bq. Probably obvious, but once ALL is specified, all fields will be returned, 
even if only some Fields are specified and others not.

Exactly.

> [Storage abstraction] Create backing storage read interface for ATS readers
> ---
>
> Key: YARN-3051
> URL: https://issues.apache.org/jira/browse/YARN-3051
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Affects Versions: YARN-2928
>Reporter: Sangjin Lee
>Assignee: Varun Saxena
> Attachments: YARN-3051-YARN-2928.003.patch, 
> YARN-3051-YARN-2928.03.patch, YARN-3051-YARN-2928.04.patch, 
> YARN-3051.Reader_API.patch, YARN-3051.Reader_API_1.patch, 
> YARN-3051.Reader_API_2.patch, YARN-3051.wip.02.YARN-2928.patch, 
> YARN-3051.wip.patch, YARN-3051_temp.patch
>
>
> Per design in YARN-2928, create backing storage read interface that can be 
> implemented by multiple backing storage implementations.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3051) [Storage abstraction] Create backing storage read interface for ATS readers

2015-06-19 Thread Joep Rottinghuis (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14594170#comment-14594170
 ] 

Joep Rottinghuis commented on YARN-3051:


When discussing with [~sjlee0] we noticed a couple of other items.
The isRelatedTo argument takes a set of Set
But TimelineEntity.getIsRelatedToEntities() returns a Map<String, Set<String>> 
getIsRelatedToEntities().
Presumably these correspond, but the Map at least enforces that each key (the 
entity type) occurs only once.

[~sjlee0] spotted a few other flaws that occur in multiple classes. I'll let 
him chime in.


> [Storage abstraction] Create backing storage read interface for ATS readers
> ---
>
> Key: YARN-3051
> URL: https://issues.apache.org/jira/browse/YARN-3051
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Affects Versions: YARN-2928
>Reporter: Sangjin Lee
>Assignee: Varun Saxena
> Attachments: YARN-3051-YARN-2928.003.patch, 
> YARN-3051-YARN-2928.03.patch, YARN-3051-YARN-2928.04.patch, 
> YARN-3051.Reader_API.patch, YARN-3051.Reader_API_1.patch, 
> YARN-3051.Reader_API_2.patch, YARN-3051.Reader_API_3.patch, 
> YARN-3051.wip.02.YARN-2928.patch, YARN-3051.wip.patch, YARN-3051_temp.patch
>
>
> Per design in YARN-2928, create backing storage read interface that can be 
> implemented by multiple backing storage implementations.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3051) [Storage abstraction] Create backing storage read interface for ATS readers

2015-06-19 Thread Sangjin Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14594180#comment-14594180
 ] 

Sangjin Lee commented on YARN-3051:
---

To clarify, I think as a rule it would be good for these arguments to match (or 
follow closely) the types defined in {{TimelineEntity}}.

In its current form, if we used {{Set}} it would 
match *any* relationship. It might be better to qualify the match with the 
right type of relationship. Thoughts?
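
A type-qualified match could be sketched as follows, assuming both the query argument and the entity expose relationships as a map from entity type to a set of ids. `RelationMatcher` is a hypothetical helper, and the all-ids-must-be-present semantics is just one of the possible interpretations.

```java
import java.util.Map;
import java.util.Set;

final class RelationMatcher {
  /**
   * True if, for every requested entity type, the entity is related to all
   * of the requested ids of that type. Extra relations on the entity are allowed.
   */
  static boolean matchesAll(Map<String, Set<String>> entityRelations,
      Map<String, Set<String>> requested) {
    for (Map.Entry<String, Set<String>> e : requested.entrySet()) {
      Set<String> actual = entityRelations.get(e.getKey());
      if (actual == null || !actual.containsAll(e.getValue())) {
        return false;
      }
    }
    return true;
  }
}
```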

> [Storage abstraction] Create backing storage read interface for ATS readers
> ---
>
> Key: YARN-3051
> URL: https://issues.apache.org/jira/browse/YARN-3051
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Affects Versions: YARN-2928
>Reporter: Sangjin Lee
>Assignee: Varun Saxena
> Attachments: YARN-3051-YARN-2928.003.patch, 
> YARN-3051-YARN-2928.03.patch, YARN-3051-YARN-2928.04.patch, 
> YARN-3051.Reader_API.patch, YARN-3051.Reader_API_1.patch, 
> YARN-3051.Reader_API_2.patch, YARN-3051.Reader_API_3.patch, 
> YARN-3051.wip.02.YARN-2928.patch, YARN-3051.wip.patch, YARN-3051_temp.patch
>
>
> Per design in YARN-2928, create backing storage read interface that can be 
> implemented by multiple backing storage implementations.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3051) [Storage abstraction] Create backing storage read interface for ATS readers

2015-06-19 Thread Sangjin Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14594192#comment-14594192
 ] 

Sangjin Lee commented on YARN-3051:
---

I'd like to discuss {{KeyValuePair}}. We're using {{Set<KeyValuePair>}} for 
config and info. However, in {{TimelineEntity}}, we have 

{code}
info: HashMap<String, Object>
config: HashMap<String, String>
{code}

Wouldn't it be better to use these types (i.e. maps vs. sets) for info and 
config instead of using {{KeyValuePair}}? That would also naturally resolve any 
issues with duplicate keys, etc. The way it stands, since {{KeyValuePair}} does 
not override {{hashCode()}} or {{equals()}}, {{Set<KeyValuePair>}} would allow 
entries with duplicate keys. I just think it'd be better to stick with the same 
types used by {{TimelineEntity}}.

BTW, we also noticed that neither {{TimelineEntity}} nor 
{{TimelineEntity.Identifier}} implements {{equals()}} or {{hashCode()}}. This 
will be problematic whenever we put them in a collection such as a set. We 
should define the equality semantics on them and add those methods for them to 
be used safely in a set or in a map as keys. I'll probably file a separate JIRA 
on this point. Thoughts?
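
The duplicate-key concern is easy to reproduce in isolation. The `KeyValuePair` class below is a stand-in written for this example (it is not the class from the patch); like the one described above, it does not override `equals()` or `hashCode()`.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Stand-in pair type that, like the one discussed, inherits identity equality. */
final class KeyValuePair {
  final String key;
  final Object value;

  KeyValuePair(String key, Object value) {
    this.key = key;
    this.value = value;
  }
}

final class DuplicateKeyDemo {
  /** Two pairs with the same key and value are both kept by the set. */
  static int sizeAsSet() {
    Set<KeyValuePair> pairs = new HashSet<>();
    pairs.add(new KeyValuePair("mapreduce.map.memory.mb", "1024"));
    pairs.add(new KeyValuePair("mapreduce.map.memory.mb", "1024"));
    return pairs.size();
  }

  /** The same two puts into a map collapse to a single entry. */
  static int sizeAsMap() {
    Map<String, Object> map = new HashMap<>();
    map.put("mapreduce.map.memory.mb", "1024");
    map.put("mapreduce.map.memory.mb", "1024");
    return map.size();
  }

  public static void main(String[] args) {
    System.out.println("set size: " + sizeAsSet()); // 2
    System.out.println("map size: " + sizeAsMap()); // 1
  }
}
```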

> [Storage abstraction] Create backing storage read interface for ATS readers
> ---
>
> Key: YARN-3051
> URL: https://issues.apache.org/jira/browse/YARN-3051
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Affects Versions: YARN-2928
>Reporter: Sangjin Lee
>Assignee: Varun Saxena
> Attachments: YARN-3051-YARN-2928.003.patch, 
> YARN-3051-YARN-2928.03.patch, YARN-3051-YARN-2928.04.patch, 
> YARN-3051.Reader_API.patch, YARN-3051.Reader_API_1.patch, 
> YARN-3051.Reader_API_2.patch, YARN-3051.Reader_API_3.patch, 
> YARN-3051.wip.02.YARN-2928.patch, YARN-3051.wip.patch, YARN-3051_temp.patch
>
>
> Per design in YARN-2928, create backing storage read interface that can be 
> implemented by multiple backing storage implementations.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3051) [Storage abstraction] Create backing storage read interface for ATS readers

2015-06-19 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14594194#comment-14594194
 ] 

Hadoop QA commented on YARN-3051:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |  14m 45s | Pre-patch trunk compilation is healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any @author tags. |
| {color:red}-1{color} | tests included |   0m  0s | The patch doesn't appear to include any new or modified tests.  Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. |
| {color:green}+1{color} | javac |   7m 36s | There were no new javac warning messages. |
| {color:green}+1{color} | javadoc |   9m 42s | There were no new javadoc warning messages. |
| {color:green}+1{color} | release audit |   0m 22s | The applied patch does not increase the total number of release audit warnings. |
| {color:green}+1{color} | checkstyle |   0m  0s | There were no new checkstyle issues. |
| {color:green}+1{color} | whitespace |   0m  0s | The patch has no lines that end in whitespace. |
| {color:green}+1{color} | install |   1m 32s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 32s | The patch built with eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   0m  0s | The patch does not introduce any new Findbugs (version ) warnings. |
| | |  34m 35s | |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | http://issues.apache.org/jira/secure/attachment/12740760/YARN-3051.Reader_API_3.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / 49f5d20 |
| Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/8290/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf907.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | https://builds.apache.org/job/PreCommit-YARN-Build/8290/console |


This message was automatically generated.

> [Storage abstraction] Create backing storage read interface for ATS readers
> ---
>
> Key: YARN-3051
> URL: https://issues.apache.org/jira/browse/YARN-3051
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Affects Versions: YARN-2928
>Reporter: Sangjin Lee
>Assignee: Varun Saxena
> Attachments: YARN-3051-YARN-2928.003.patch, 
> YARN-3051-YARN-2928.03.patch, YARN-3051-YARN-2928.04.patch, 
> YARN-3051.Reader_API.patch, YARN-3051.Reader_API_1.patch, 
> YARN-3051.Reader_API_2.patch, YARN-3051.Reader_API_3.patch, 
> YARN-3051.wip.02.YARN-2928.patch, YARN-3051.wip.patch, YARN-3051_temp.patch
>
>
> Per design in YARN-2928, create backing storage read interface that can be 
> implemented by multiple backing storage implementations.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3051) [Storage abstraction] Create backing storage read interface for ATS readers

2015-06-19 Thread Zhijie Shen (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14594209#comment-14594209
 ] 

Zhijie Shen commented on YARN-3051:
---

Good catch! I reverted it back to map. Set is the legacy from the v1 reader API.

bq. I'll probably file a separate JIRA on this point. Thoughts?

Yeah, please go ahead, not just for entity/identifier, but all data model 
objects.

> [Storage abstraction] Create backing storage read interface for ATS readers
> ---
>
> Key: YARN-3051
> URL: https://issues.apache.org/jira/browse/YARN-3051
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Affects Versions: YARN-2928
>Reporter: Sangjin Lee
>Assignee: Varun Saxena
> Attachments: YARN-3051-YARN-2928.003.patch, 
> YARN-3051-YARN-2928.03.patch, YARN-3051-YARN-2928.04.patch, 
> YARN-3051.Reader_API.patch, YARN-3051.Reader_API_1.patch, 
> YARN-3051.Reader_API_2.patch, YARN-3051.Reader_API_3.patch, 
> YARN-3051.Reader_API_4.patch, YARN-3051.wip.02.YARN-2928.patch, 
> YARN-3051.wip.patch, YARN-3051_temp.patch
>
>
> Per design in YARN-2928, create backing storage read interface that can be 
> implemented by multiple backing storage implementations.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3051) [Storage abstraction] Create backing storage read interface for ATS readers

2015-06-19 Thread Sangjin Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14594224#comment-14594224
 ] 

Sangjin Lee commented on YARN-3051:
---

YARN-3836 added.

> [Storage abstraction] Create backing storage read interface for ATS readers
> ---
>
> Key: YARN-3051
> URL: https://issues.apache.org/jira/browse/YARN-3051
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Affects Versions: YARN-2928
>Reporter: Sangjin Lee
>Assignee: Varun Saxena
> Attachments: YARN-3051-YARN-2928.003.patch, 
> YARN-3051-YARN-2928.03.patch, YARN-3051-YARN-2928.04.patch, 
> YARN-3051.Reader_API.patch, YARN-3051.Reader_API_1.patch, 
> YARN-3051.Reader_API_2.patch, YARN-3051.Reader_API_3.patch, 
> YARN-3051.Reader_API_4.patch, YARN-3051.wip.02.YARN-2928.patch, 
> YARN-3051.wip.patch, YARN-3051_temp.patch
>
>
> Per design in YARN-2928, create backing storage read interface that can be 
> implemented by multiple backing storage implementations.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3051) [Storage abstraction] Create backing storage read interface for ATS readers

2015-06-19 Thread Joep Rottinghuis (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14594241#comment-14594241
 ] 

Joep Rottinghuis commented on YARN-3051:


Can somebody please remind me why TimelineEntity has HashMap<String, String> 
for configs and not HashMap<String, Object> as in info?
o.a.h.c.Configuration can have things like Boolean, Double, BigDecimal etc. 
etc. right? We're not retaining that?
I think the HBaseWriterImpl has capabilities to simply serialize all these 
correctly through the GenericObjectMapper.



[jira] [Commented] (YARN-3051) [Storage abstraction] Create backing storage read interface for ATS readers

2015-06-19 Thread Sangjin Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14594243#comment-14594243
 ] 

Sangjin Lee commented on YARN-3051:
---

Hadoop's {{Configuration}} is actually (string, string). Typed values are 
passed on as strings eventually to {{Configuration}}.

For example, the iterator for {{Configuration}}:

{code}
public Iterator<Map.Entry<String, String>> iterator();
{code}
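
As a rough illustration of the point above, here is a minimal, self-contained sketch of a string-backed configuration in which typed setters convert values to strings before storing them. This is not Hadoop code: the class {{StringBackedConf}} and its methods are hypothetical stand-ins that only mimic the general shape of {{Configuration}}.

```java
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for illustration only; NOT Hadoop's Configuration class.
final class StringBackedConf implements Iterable<Map.Entry<String, String>> {
  private final Map<String, String> props = new LinkedHashMap<>();

  void set(String name, String value) {
    props.put(name, value);
  }

  // Typed setters convert the value to a string before storing it.
  void setBoolean(String name, boolean value) {
    set(name, Boolean.toString(value));
  }

  void setDouble(String name, double value) {
    set(name, Double.toString(value));
  }

  // The raw view is always a string, whatever setter was used.
  String get(String name) {
    return props.get(name);
  }

  // Typed getters parse the stored string on the way out.
  boolean getBoolean(String name, boolean defaultValue) {
    String v = props.get(name);
    return v == null ? defaultValue : Boolean.parseBoolean(v);
  }

  @Override
  public Iterator<Map.Entry<String, String>> iterator() {
    return props.entrySet().iterator();
  }
}
```

Under this model, anything that iterates over the configuration only ever sees (string, string) pairs, so the original type of a value is not recoverable from the stored data alone.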



[jira] [Commented] (YARN-3051) [Storage abstraction] Create backing storage read interface for ATS readers

2015-06-19 Thread Zhijie Shen (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14594249#comment-14594249
 ] 

Zhijie Shen commented on YARN-3051:
---

And if we process configs in bulk, such as when loading a config file, it's 
difficult to assume we know the type of each config up front.



[jira] [Commented] (YARN-3051) [Storage abstraction] Create backing storage read interface for ATS readers

2015-06-19 Thread Zhijie Shen (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14594410#comment-14594410
 ] 

Zhijie Shen commented on YARN-3051:
---

[~varun_saxena], would you please take over the reader API patch and move it 
forward, i.e., consolidate the comments, implement the FS-based reader, wire 
it up to the web front end, build the reader server, and so on?



[jira] [Commented] (YARN-3051) [Storage abstraction] Create backing storage read interface for ATS readers

2015-06-20 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14594468#comment-14594468
 ] 

Hadoop QA commented on YARN-3051:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |  14m 53s | Pre-patch trunk compilation is healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any @author tags. |
| {color:red}-1{color} | tests included |   0m  0s | The patch doesn't appear to include any new or modified tests.  Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. |
| {color:green}+1{color} | javac |   7m 56s | There were no new javac warning messages. |
| {color:green}+1{color} | javadoc |  10m  2s | There were no new javadoc warning messages. |
| {color:green}+1{color} | release audit |   0m 23s | The applied patch does not increase the total number of release audit warnings. |
| {color:green}+1{color} | checkstyle |   0m  0s | There were no new checkstyle issues. |
| {color:green}+1{color} | whitespace |   0m  0s | The patch has no lines that end in whitespace. |
| {color:green}+1{color} | install |   1m 33s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 33s | The patch built with eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   0m  0s | The patch does not introduce any new Findbugs (version ) warnings. |
| | |  35m 25s | |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | http://issues.apache.org/jira/secure/attachment/12740776/YARN-3051.Reader_API_4.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / 20c03c9 |
| Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/8295/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf904.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | https://builds.apache.org/job/PreCommit-YARN-Build/8295/console |


This message was automatically generated.



[jira] [Commented] (YARN-3051) [Storage abstraction] Create backing storage read interface for ATS readers

2015-06-28 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14604932#comment-14604932
 ] 

Hadoop QA commented on YARN-3051:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | pre-patch |  16m 56s | Findbugs (version ) appears to be broken on YARN-2928. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any @author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to include 1 new or modified test files. |
| {color:green}+1{color} | javac |   7m 42s | There were no new javac warning messages. |
| {color:green}+1{color} | javadoc |   9m 44s | There were no new javadoc warning messages. |
| {color:green}+1{color} | release audit |   0m 23s | The applied patch does not increase the total number of release audit warnings. |
| {color:red}-1{color} | checkstyle |   1m 23s | The applied patch generated  3 new checkstyle issues (total was 234, now 236). |
| {color:green}+1{color} | whitespace |   0m  0s | The patch has no lines that end in whitespace. |
| {color:green}+1{color} | install |   1m 38s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 40s | The patch built with eclipse:eclipse. |
| {color:red}-1{color} | findbugs |   3m 56s | The patch appears to introduce 1 new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | yarn tests |   0m 25s | Tests passed in hadoop-yarn-api. |
| {color:green}+1{color} | yarn tests |   1m 57s | Tests passed in hadoop-yarn-common. |
| {color:green}+1{color} | yarn tests |   1m 18s | Tests passed in hadoop-yarn-server-timelineservice. |
| | |  46m 29s | |
\\
\\
|| Reason || Tests ||
| FindBugs | module:hadoop-yarn-server-timelineservice |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | http://issues.apache.org/jira/secure/attachment/12742423/YARN-3051-YARN-2928.05.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | YARN-2928 / 84f37f1 |
| checkstyle | https://builds.apache.org/job/PreCommit-YARN-Build/8370/artifact/patchprocess/diffcheckstylehadoop-yarn-api.txt |
| Findbugs warnings | https://builds.apache.org/job/PreCommit-YARN-Build/8370/artifact/patchprocess/newPatchFindbugsWarningshadoop-yarn-server-timelineservice.html |
| hadoop-yarn-api test log | https://builds.apache.org/job/PreCommit-YARN-Build/8370/artifact/patchprocess/testrun_hadoop-yarn-api.txt |
| hadoop-yarn-common test log | https://builds.apache.org/job/PreCommit-YARN-Build/8370/artifact/patchprocess/testrun_hadoop-yarn-common.txt |
| hadoop-yarn-server-timelineservice test log | https://builds.apache.org/job/PreCommit-YARN-Build/8370/artifact/patchprocess/testrun_hadoop-yarn-server-timelineservice.txt |
| Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/8370/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf906.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | https://builds.apache.org/job/PreCommit-YARN-Build/8370/console |


This message was automatically generated.

> [Storage abstraction] Create backing storage read interface for ATS readers
> ---
>
> Key: YARN-3051
> URL: https://issues.apache.org/jira/browse/YARN-3051
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Affects Versions: YARN-2928
>Reporter: Sangjin Lee
>Assignee: Varun Saxena
> Attachments: YARN-3051-YARN-2928.003.patch, 
> YARN-3051-YARN-2928.03.patch, YARN-3051-YARN-2928.04.patch, 
> YARN-3051-YARN-2928.05.patch, YARN-3051.Reader_API.patch, 
> YARN-3051.Reader_API_1.patch, YARN-3051.Reader_API_2.patch, 
> YARN-3051.Reader_API_3.patch, YARN-3051.Reader_API_4.patch, 
> YARN-3051.wip.02.YARN-2928.patch, YARN-3051.wip.patch, YARN-3051_temp.patch
>
>
> Per design in YARN-2928, create backing storage read interface that can be 
> implemented by multiple backing storage implementations.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3051) [Storage abstraction] Create backing storage read interface for ATS readers

2015-06-29 Thread Varun Saxena (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14605376#comment-14605376
 ] 

Varun Saxena commented on YARN-3051:


[~jrottinghuis], 
bq. In an earlier patch I saw NameValueRelation that was able to perform the 
operations. That again assumes that all values will be retrieved from the 
backing store, and then filtered in the reader before returned to the user. It 
will be more effective to make sure we can easily map this to operations we can 
push into HBase itself (through a ColumnValueFilter) through the available 
operations 
NameValueRelation would be used in metric filters and specifies the metric 
name and its relation to a value. The store implementation does not have to 
use its {{match}} function; it was added for use by the FS-based 
implementation.
{{RelationOp}}, for instance, maps directly to {{CompareFilter.CompareOp}}, 
although that wasn't intentional, so it should not be too difficult for the 
backend implementation to convert it into an HBase Filter.
Currently only AND operations are supported. Support for OR operations will 
be handled as part of other JIRAs.
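
A minimal sketch of the idea described above, under stated assumptions: the types {{SketchRelationOp}} and {{SketchNameValueRelation}} are hypothetical and are not taken from the patch. The enum constants are chosen to line up with the comparison operators in HBase's {{CompareFilter.CompareOp}} (LESS, LESS_OR_EQUAL, EQUAL, NOT_EQUAL, GREATER_OR_EQUAL, GREATER), and metric values are simplified to a single long per metric name.

```java
import java.util.List;
import java.util.Map;

// Hypothetical shapes for illustration; the patch's actual classes may differ.
enum SketchRelationOp {
  EQUAL, NOT_EQUAL, LESS, LESS_OR_EQUAL, GREATER, GREATER_OR_EQUAL;

  // Evaluates "actual <op> expected".
  boolean test(long actual, long expected) {
    switch (this) {
      case EQUAL: return actual == expected;
      case NOT_EQUAL: return actual != expected;
      case LESS: return actual < expected;
      case LESS_OR_EQUAL: return actual <= expected;
      case GREATER: return actual > expected;
      case GREATER_OR_EQUAL: return actual >= expected;
      default: throw new AssertionError(this);
    }
  }
}

final class SketchNameValueRelation {
  final String name;
  final SketchRelationOp op;
  final long value;

  SketchNameValueRelation(String name, SketchRelationOp op, long value) {
    this.name = name;
    this.op = op;
    this.value = value;
  }

  // In-memory check, the kind of thing a filesystem-based reader could use.
  // An entity without the named metric never matches.
  boolean match(Map<String, Long> metrics) {
    Long actual = metrics.get(name);
    return actual != null && op.test(actual, value);
  }

  // AND semantics: every relation in the list must hold.
  static boolean matchAll(List<SketchNameValueRelation> filters,
                          Map<String, Long> metrics) {
    for (SketchNameValueRelation f : filters) {
      if (!f.match(metrics)) {
        return false;
      }
    }
    return true;
  }
}
```

A backend that can evaluate comparisons itself would not need {{match}} at all; it could translate each (name, op, value) triple into its own filter type and leave the in-memory path to stores that cannot push filters down.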



[jira] [Commented] (YARN-3051) [Storage abstraction] Create backing storage read interface for ATS readers

2015-06-29 Thread Li Lu (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14606458#comment-14606458
 ] 

Li Lu commented on YARN-3051:
-

Hi [~varun_saxena], thanks for the patch! I think this version is much closer, 
and we should have a reader-storage interface soon. The general approach looks 
fine, but I have one concern. While looking at it I found a lot of code 
related to the detailed filter design, such as some binary relational 
operators. I'm not sure whether we need to settle those filter designs in this 
JIRA, or whether we simply want some basic, name-based filters, like 
"filtering out entities with metric HDFS_BYTES_WRITE". For detailed filter 
designs, we may need to consider our storage-level implementations, such as 
our HBase implementation. After a general skim through the rest of the patch I 
think it's fine, and I'll post a detailed review of the rest of the code soon. 
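
A small sketch of what a basic name-based filter could look like, assuming that "filtering out" means selecting the entities that carry the named metric. The class {{NameBasedMetricFilter}} and the map-of-sets input shape are hypothetical and chosen only to keep the example self-contained.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical helper for illustration; not part of the patch.
final class NameBasedMetricFilter {
  // metricNamesByEntity: entity id -> names of the metrics recorded for it.
  // Returns the ids of entities that carry every metric name in 'required'.
  static List<String> entitiesWithMetrics(
      Map<String, Set<String>> metricNamesByEntity, Set<String> required) {
    List<String> matching = new ArrayList<>();
    for (Map.Entry<String, Set<String>> e : metricNamesByEntity.entrySet()) {
      if (e.getValue().containsAll(required)) {
        matching.add(e.getKey());
      }
    }
    return matching;
  }
}
```

No values or relational operators are involved here; the filter only asks whether a metric with the given name exists on the entity.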



[jira] [Commented] (YARN-3051) [Storage abstraction] Create backing storage read interface for ATS readers

2015-06-29 Thread Varun Saxena (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14606466#comment-14606466
 ] 

Varun Saxena commented on YARN-3051:


[~gtCarrera9], I included the relational operators because I had already 
written them, although I had raised another JIRA for filters. We probably need 
to support the OR operator as well (not only AND). If you want, I can move the 
relational operator part out of this JIRA and into that one, and keep a simple 
metric filter (based on metric name) here.



[jira] [Commented] (YARN-3051) [Storage abstraction] Create backing storage read interface for ATS readers

2015-06-29 Thread Varun Saxena (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14606469#comment-14606469
 ] 

Varun Saxena commented on YARN-3051:


To be precise, YARN-3863 is meant for that: making filters as close as 
possible to the backend storage implementation (based on HBase Filters).



[jira] [Commented] (YARN-3051) [Storage abstraction] Create backing storage read interface for ATS readers

2015-06-29 Thread Li Lu (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14606510#comment-14606510
 ] 

Li Lu commented on YARN-3051:
-

OK, linked all related JIRAs to this one. Feel free to add more. 



[jira] [Commented] (YARN-3051) [Storage abstraction] Create backing storage read interface for ATS readers

2015-06-29 Thread Zhijie Shen (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14606542#comment-14606542
 ] 

Zhijie Shen commented on YARN-3051:
---

Varun, thanks for taking over the reader interface patch. I noticed there's a 
param difference: Map  metricFilters.  I'd like to 
recommend we don't support the binary relationship in the initial jira. My 
initial param is meant to filter the entities that contain the given metrics. 
We can file a separate jira to add binary relationship filtering, for metrics, 
config and info altogether. What do you think?

Here's a couple of comments about the patch:

1. Why do we need JsonSetter for the data model objects?

2. Can we avoid introducing test-oriented configurations into 
YarnConfiguration, which is part of the api?

3. Does "getTimelineRecordFromJSON" need to be exposed? I'm a bit 
conservative about putting methods in api/common, which means we need to keep 
supporting them.

4. Maybe {{Field}} is better as an inner class of TimelineReader or 
TimelineEntity. Otherwise, the name is a bit vague about what it represents.

5. It seems to be better to implement FileSystemTimelineStorageImpl that 
implements both TimelineReader and TimelineWriter. One motivation is to reuse 
some code. A more critical problem is that FS reader and writer are not 
integrated:
a) Writer should not have written the mapping into APP_FLOW_MAPPING_FILE.
b) Currently, when updating an entity, a new entity json will be appended into 
the same file, but this reader impl assumes one entity per file. We need to 
sync the behavior between them.



[jira] [Commented] (YARN-3051) [Storage abstraction] Create backing storage read interface for ATS readers

2015-06-29 Thread Varun Saxena (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14606577#comment-14606577
 ] 

Varun Saxena commented on YARN-3051:


[~zjshen], thanks for looking at the patch.
bq.  I'd like to recommend we don't support the binary relationship in the 
initial jira. My initial param means to filter the entities who contain the 
given metrics
OK, will make the change. Let's move this code out to YARN-3863.
One more thing I have changed: the default limit on entities is now 100 
instead of 1000. 1000 seemed too many. Thoughts?

bq. Why do we need JsonSetter for the data model objects?
While reading back JSON dump from file, this is required.

bq. Maybe Field is better to be the inner class of TimelineReader or 
TimelineEntity. Otherwise, the name a bit vague about what it represents.
Ok.

bq. A more critical problem is that FS reader and writer are not integrated:
I was thinking of raising a new JIRA to integrate the writer and reader 
implementations because it's not directly related to the Reader API JIRA. Will 
do so.

bq. It seems to be better to implement FileSystemTimelineStorageImpl that 
implements both TimelineReader and TimelineWriter. 
That's a good suggestion.

bq. Currently, when updating an entity, a new entity json will be appended into 
the same file, but this reader impl assumes one entity per file
Ok, so the last entity entry should be the one returned? I will check the 
writer-side code and ask if I have any questions. While combining the FS 
writer and reader, we can decide on the best option.

bq.  Is "getTimelineRecordFromJSON" required to be exposed.
I added it in TimelineUtils because dumpTimelineRecordtoJSON, used by the FS 
Writer, was also put in that class, and the ObjectMapper used by both methods 
is initialized there as well. If we combine the FS Writer and Reader into one 
class, we can probably move both methods into that class. They aren't likely 
to be used outside the FS implementation.



[jira] [Commented] (YARN-3051) [Storage abstraction] Create backing storage read interface for ATS readers

2015-06-30 Thread Zhijie Shen (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14608774#comment-14608774
 ] 

Zhijie Shen commented on YARN-3051:
---

bq. One more thing I have changed: the default limit on entities is now 100 
instead of 1000. 1000 seemed too many. Thoughts?

Sure, just noticed that the previous limit is 100 too.

bq. Ok, so the last entity entry should be the one returned ?

It's not that straightforward. For example, I can put entity 1 twice: one 
put contains event 1 and the other contains event 2. Then, when I retrieve 
entity 1 with the event field included, I actually want both events. I can 
see two choices: one is to merge the entity data on the write path and the 
other is to merge it on the read path.

bq. Added in TimelineUtils because dumpTimelineRecordtoJSON used by FS Writer 
was also put in the same class.

That method is used by downstream projects (e.g., Tez) for logging/debugging 
the ATS integration. All getters of the data model objects are annotated, so 
the method is applicable to all of these objects. On the other hand, we only 
annotate TimelineEntity with JsonSetter, so getTimelineRecordFromJSON is not 
general enough for all purposes, only for the FS impl for now. Maybe we can 
hold back the method and promote it to a public api once we see a real use 
case for it. Thoughts?



[jira] [Commented] (YARN-3051) [Storage abstraction] Create backing storage read interface for ATS readers

2015-06-30 Thread Varun Saxena (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14609172#comment-14609172
 ] 

Varun Saxena commented on YARN-3051:


bq. Maybe we can hold back the method and promote it to public api once we see 
real use case of it. 
Ok will move it back to FS Reader.



[jira] [Commented] (YARN-3051) [Storage abstraction] Create backing storage read interface for ATS readers

2015-06-30 Thread Varun Saxena (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14609177#comment-14609177
 ] 

Varun Saxena commented on YARN-3051:


bq. It's not that straightforward. 
Hmm. So it's basically a union of everything. Will handle it, on the read 
path for now.



[jira] [Commented] (YARN-3051) [Storage abstraction] Create backing storage read interface for ATS readers

2015-06-30 Thread Sangjin Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14609205#comment-14609205
 ] 

Sangjin Lee commented on YARN-3051:
---

bq. It's not that straightforward. For example, I can put entity 1 twice: one 
contains event 1 and the other contains event 2. In fact, when I want to 
retrieve entity 1 with the event field included, I actually want to have both 
events. I can see two choices: one is to merge the entity data at the write 
path and the other at the read path.

I think it would be easier (for the filesystem writer/reader) to do this on the 
read path.
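
As a rough illustration of the read-path option, the sketch below folds every 
stored record for the same entity id into one entity whose events are the union 
of all records. SketchEntity and ReadPathMerge are hypothetical stand-ins, not 
the real TimelineEntity or reader classes, and events are reduced to plain 
strings for brevity.

```java
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in for a timeline entity; not the real TimelineEntity.
final class SketchEntity {
  final String id;
  final Set<String> events = new LinkedHashSet<>();

  SketchEntity(String id, String... events) {
    this.id = id;
    for (String e : events) {
      this.events.add(e);
    }
  }
}

final class ReadPathMerge {
  // Folds all stored records that share an id into a single entity whose
  // event set is the union of the events of those records.
  static Map<String, SketchEntity> merge(List<SketchEntity> records) {
    Map<String, SketchEntity> merged = new LinkedHashMap<>();
    for (SketchEntity record : records) {
      SketchEntity target = merged.get(record.id);
      if (target == null) {
        target = new SketchEntity(record.id);
        merged.put(record.id, target);
      }
      target.events.addAll(record.events);
    }
    return merged;
  }
}
```

With this shape the writer can simply append each put as its own record, and 
the cost of merging is paid per query.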

> [Storage abstraction] Create backing storage read interface for ATS readers
> ---
>
> Key: YARN-3051
> URL: https://issues.apache.org/jira/browse/YARN-3051
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Affects Versions: YARN-2928
>Reporter: Sangjin Lee
>Assignee: Varun Saxena
> Attachments: YARN-3051-YARN-2928.003.patch, 
> YARN-3051-YARN-2928.03.patch, YARN-3051-YARN-2928.04.patch, 
> YARN-3051-YARN-2928.05.patch, YARN-3051.Reader_API.patch, 
> YARN-3051.Reader_API_1.patch, YARN-3051.Reader_API_2.patch, 
> YARN-3051.Reader_API_3.patch, YARN-3051.Reader_API_4.patch, 
> YARN-3051.wip.02.YARN-2928.patch, YARN-3051.wip.patch, YARN-3051_temp.patch
>
>
> Per design in YARN-2928, create backing storage read interface that can be 
> implemented by multiple backing storage implementations.





[jira] [Commented] (YARN-3051) [Storage abstraction] Create backing storage read interface for ATS readers

2015-07-01 Thread Varun Saxena (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14610626#comment-14610626
 ] 

Varun Saxena commented on YARN-3051:


Is there any reason metrics and events in TimelineEntity are stored in a set? A 
map would make some operations easier and more efficient in the FS 
implementation.
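
To make the point concrete, here is a minimal sketch of the kind of lookup a 
map makes cheap. SketchMetric and MetricIndex are hypothetical names used only 
for illustration; the real metric class has more fields. With only a set, 
finding a metric by id means scanning every element, whereas an id-keyed map 
answers the same question with a single lookup.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in for a metric; the real class has more fields.
final class SketchMetric {
  final String id;
  final long value;

  SketchMetric(String id, long value) {
    this.id = id;
    this.value = value;
  }
}

final class MetricIndex {
  // Builds an id -> metric index in one pass over the set, so that later
  // lookups by id do not need to iterate over every element.
  static Map<String, SketchMetric> byId(Set<SketchMetric> metrics) {
    Map<String, SketchMetric> index = new HashMap<>();
    for (SketchMetric metric : metrics) {
      index.put(metric.id, metric);
    }
    return index;
  }
}
```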

> [Storage abstraction] Create backing storage read interface for ATS readers
> ---
>
> Key: YARN-3051
> URL: https://issues.apache.org/jira/browse/YARN-3051
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Affects Versions: YARN-2928
>Reporter: Sangjin Lee
>Assignee: Varun Saxena
> Attachments: YARN-3051-YARN-2928.003.patch, 
> YARN-3051-YARN-2928.03.patch, YARN-3051-YARN-2928.04.patch, 
> YARN-3051-YARN-2928.05.patch, YARN-3051-YARN-2928.06.patch, 
> YARN-3051.Reader_API.patch, YARN-3051.Reader_API_1.patch, 
> YARN-3051.Reader_API_2.patch, YARN-3051.Reader_API_3.patch, 
> YARN-3051.Reader_API_4.patch, YARN-3051.wip.02.YARN-2928.patch, 
> YARN-3051.wip.patch, YARN-3051_temp.patch
>
>
> Per design in YARN-2928, create backing storage read interface that can be 
> implemented by multiple backing storage implementations.





[jira] [Commented] (YARN-3051) [Storage abstraction] Create backing storage read interface for ATS readers

2015-07-01 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14610700#comment-14610700
 ] 

Hadoop QA commented on YARN-3051:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | pre-patch | 16m 37s | Findbugs (version ) appears to be broken on YARN-2928. |
| {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. |
| {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 1 new or modified test files. |
| {color:green}+1{color} | javac | 7m 47s | There were no new javac warning messages. |
| {color:green}+1{color} | javadoc | 9m 43s | There were no new javadoc warning messages. |
| {color:green}+1{color} | release audit | 0m 22s | The applied patch does not increase the total number of release audit warnings. |
| {color:red}-1{color} | checkstyle | 1m 6s | The applied patch generated 3 new checkstyle issues (total was 234, now 236). |
| {color:green}+1{color} | whitespace | 0m 0s | The patch has no lines that end in whitespace. |
| {color:green}+1{color} | install | 1m 35s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse | 0m 37s | The patch built with eclipse:eclipse. |
| {color:red}-1{color} | findbugs | 2m 24s | The patch appears to introduce 1 new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | yarn tests | 0m 25s | Tests passed in hadoop-yarn-api. |
| {color:red}-1{color} | yarn tests | 7m 52s | Tests failed in hadoop-yarn-server-timelineservice. |
| | | 48m 43s | |
\\
\\
|| Reason || Tests ||
| FindBugs | module:hadoop-yarn-server-timelineservice |
| Failed unit tests | hadoop.yarn.server.timelineservice.storage.TestHBaseTimelineWriterImpl |
|   | hadoop.yarn.server.timelineservice.storage.TestPhoenixTimelineWriterImpl |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | http://issues.apache.org/jira/secure/attachment/12743118/YARN-3051-YARN-2928.06.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | YARN-2928 / 18c4859 |
| checkstyle | https://builds.apache.org/job/PreCommit-YARN-Build/8409/artifact/patchprocess/diffcheckstylehadoop-yarn-api.txt |
| Findbugs warnings | https://builds.apache.org/job/PreCommit-YARN-Build/8409/artifact/patchprocess/newPatchFindbugsWarningshadoop-yarn-server-timelineservice.html |
| hadoop-yarn-api test log | https://builds.apache.org/job/PreCommit-YARN-Build/8409/artifact/patchprocess/testrun_hadoop-yarn-api.txt |
| hadoop-yarn-server-timelineservice test log | https://builds.apache.org/job/PreCommit-YARN-Build/8409/artifact/patchprocess/testrun_hadoop-yarn-server-timelineservice.txt |
| Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/8409/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf901.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | https://builds.apache.org/job/PreCommit-YARN-Build/8409/console |


This message was automatically generated.

> [Storage abstraction] Create backing storage read interface for ATS readers
> ---
>
> Key: YARN-3051
> URL: https://issues.apache.org/jira/browse/YARN-3051
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Affects Versions: YARN-2928
>Reporter: Sangjin Lee
>Assignee: Varun Saxena
> Attachments: YARN-3051-YARN-2928.003.patch, 
> YARN-3051-YARN-2928.03.patch, YARN-3051-YARN-2928.04.patch, 
> YARN-3051-YARN-2928.05.patch, YARN-3051-YARN-2928.06.patch, 
> YARN-3051.Reader_API.patch, YARN-3051.Reader_API_1.patch, 
> YARN-3051.Reader_API_2.patch, YARN-3051.Reader_API_3.patch, 
> YARN-3051.Reader_API_4.patch, YARN-3051.wip.02.YARN-2928.patch, 
> YARN-3051.wip.patch, YARN-3051_temp.patch
>
>
> Per design in YARN-2928, create backing storage read interface that can be 
> implemented by multiple backing storage implementations.





[jira] [Commented] (YARN-3051) [Storage abstraction] Create backing storage read interface for ATS readers

2015-07-01 Thread Zhijie Shen (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14611156#comment-14611156
 ] 

Zhijie Shen commented on YARN-3051:
---

[~varun_saxena], thanks for being patient about the comments. Here are some 
more about the new patch.

1. We should compare the objects directly instead of converting them to String 
first.
{code}
  private static boolean matchFilter(Object infoValue, Object filterValue) {
    // Convert to String and check for now.
    return infoValue.toString().equals(filterValue.toString());
  }
{code}

2. Is no one writing the mapping into APP_FLOW_MAPPING_FILE in the current code 
base? Are you suggesting treating it as a property file? What's the rationale? 
How about using CSV format instead? It would allow 1) searching for 
user/flowId/flowRunId separately and 2) staying neutral about the path 
separator.
{code}
prop.setProperty("app1", "user1/flow1/1");
{code}

3. Can we prevent introducing the test oriented configurations into 
YarnConfiguration?

4. We can do some optimization for the file implementation, such as putting 
created time and modified time into the file name to quickly filter these files 
without reading them, and merging the entities and overwriting the file to 
avoid merging again for each query. But that's not critical here; we can do it 
later.
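
For point 1, a direct comparison could look roughly like the sketch below. 
FilterMatchSketch is a hypothetical holder class, and the sketch assumes the 
info value types provide sensible equals() implementations. It also differs 
from the toString() version in one visible way: an Integer 1 and a Long 1L 
compare equal as strings but not under equals().

```java
import java.util.Objects;

// Hypothetical holder class; only the comparison itself is the point here.
final class FilterMatchSketch {
  // Compares the stored info value with the filter value directly.
  // Objects.equals is null-safe and delegates to infoValue.equals(filterValue).
  static boolean matchFilter(Object infoValue, Object filterValue) {
    return Objects.equals(infoValue, filterValue);
  }
}
```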

> [Storage abstraction] Create backing storage read interface for ATS readers
> ---
>
> Key: YARN-3051
> URL: https://issues.apache.org/jira/browse/YARN-3051
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Affects Versions: YARN-2928
>Reporter: Sangjin Lee
>Assignee: Varun Saxena
> Attachments: YARN-3051-YARN-2928.003.patch, 
> YARN-3051-YARN-2928.03.patch, YARN-3051-YARN-2928.04.patch, 
> YARN-3051-YARN-2928.05.patch, YARN-3051-YARN-2928.06.patch, 
> YARN-3051.Reader_API.patch, YARN-3051.Reader_API_1.patch, 
> YARN-3051.Reader_API_2.patch, YARN-3051.Reader_API_3.patch, 
> YARN-3051.Reader_API_4.patch, YARN-3051.wip.02.YARN-2928.patch, 
> YARN-3051.wip.patch, YARN-3051_temp.patch
>
>
> Per design in YARN-2928, create backing storage read interface that can be 
> implemented by multiple backing storage implementations.





[jira] [Commented] (YARN-3051) [Storage abstraction] Create backing storage read interface for ATS readers

2015-07-01 Thread Varun Saxena (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14611196#comment-14611196
 ] 

Varun Saxena commented on YARN-3051:


Thanks for the review [~zjshen].

bq. 1. We should compare the objects directly instead of converting them to 
String first.
Correct.

bq. 2. No one is writing the mapping into APP_FLOW_MAPPING_FILE in the current 
code base? 
Yes, nobody had handled the app-to-flow mapping in the writer code path, so I 
came up with this solution. We can write CSV as well.
I will write the writer code too when I combine the FS writer and reader 
classes.
Regarding searching for user/flowId/flowRunId separately, do you mean storing 
them in separate files?

bq. 3. Can we prevent introducing the test oriented configurations into 
YarnConfiguration?
Do you mean the config for the FS storage root directory 
(TIMELINE_SERVICE_STORAGE_DIR_ROOT)? It is used by the writer as well and is 
expected to be configurable, hence I moved it to YarnConfiguration. Do we not 
want it as a configuration? I plan to change the config name, but that would 
require a change in the writer too.

bq. 4. We can do some optimization for the file implementation, such as putting 
created time and modified time into the file name to quickly filter these files 
without reading them, and merging the entities and overwriting the file to 
avoid merging again for each query. But that's not critical here; we can do it 
later.
These are good suggestions. I had thought of the first one too, but getting a 
working patch in took priority. Anyway, I will handle it when I merge the FS 
reader and writer.


> [Storage abstraction] Create backing storage read interface for ATS readers
> ---
>
> Key: YARN-3051
> URL: https://issues.apache.org/jira/browse/YARN-3051
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Affects Versions: YARN-2928
>Reporter: Sangjin Lee
>Assignee: Varun Saxena
> Attachments: YARN-3051-YARN-2928.003.patch, 
> YARN-3051-YARN-2928.03.patch, YARN-3051-YARN-2928.04.patch, 
> YARN-3051-YARN-2928.05.patch, YARN-3051-YARN-2928.06.patch, 
> YARN-3051.Reader_API.patch, YARN-3051.Reader_API_1.patch, 
> YARN-3051.Reader_API_2.patch, YARN-3051.Reader_API_3.patch, 
> YARN-3051.Reader_API_4.patch, YARN-3051.wip.02.YARN-2928.patch, 
> YARN-3051.wip.patch, YARN-3051_temp.patch
>
>
> Per design in YARN-2928, create backing storage read interface that can be 
> implemented by multiple backing storage implementations.





[jira] [Commented] (YARN-3051) [Storage abstraction] Create backing storage read interface for ATS readers

2015-07-02 Thread Zhijie Shen (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14612113#comment-14612113
 ] 

Zhijie Shen commented on YARN-3051:
---

2. I meant we store  in a CSV file. Thoughts?

3. I think config related to the FS impl shouldn't be put in the API, as the 
impl is not supposed to be used publicly, only for test purposes.

> [Storage abstraction] Create backing storage read interface for ATS readers
> ---
>
> Key: YARN-3051
> URL: https://issues.apache.org/jira/browse/YARN-3051
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Affects Versions: YARN-2928
>Reporter: Sangjin Lee
>Assignee: Varun Saxena
> Attachments: YARN-3051-YARN-2928.003.patch, 
> YARN-3051-YARN-2928.03.patch, YARN-3051-YARN-2928.04.patch, 
> YARN-3051-YARN-2928.05.patch, YARN-3051-YARN-2928.06.patch, 
> YARN-3051.Reader_API.patch, YARN-3051.Reader_API_1.patch, 
> YARN-3051.Reader_API_2.patch, YARN-3051.Reader_API_3.patch, 
> YARN-3051.Reader_API_4.patch, YARN-3051.wip.02.YARN-2928.patch, 
> YARN-3051.wip.patch, YARN-3051_temp.patch
>
>
> Per design in YARN-2928, create backing storage read interface that can be 
> implemented by multiple backing storage implementations.





[jira] [Commented] (YARN-3051) [Storage abstraction] Create backing storage read interface for ATS readers

2015-07-02 Thread Varun Saxena (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14612117#comment-14612117
 ] 

Varun Saxena commented on YARN-3051:


OK, will make the change.

> [Storage abstraction] Create backing storage read interface for ATS readers
> ---
>
> Key: YARN-3051
> URL: https://issues.apache.org/jira/browse/YARN-3051
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Affects Versions: YARN-2928
>Reporter: Sangjin Lee
>Assignee: Varun Saxena
> Attachments: YARN-3051-YARN-2928.003.patch, 
> YARN-3051-YARN-2928.03.patch, YARN-3051-YARN-2928.04.patch, 
> YARN-3051-YARN-2928.05.patch, YARN-3051-YARN-2928.06.patch, 
> YARN-3051.Reader_API.patch, YARN-3051.Reader_API_1.patch, 
> YARN-3051.Reader_API_2.patch, YARN-3051.Reader_API_3.patch, 
> YARN-3051.Reader_API_4.patch, YARN-3051.wip.02.YARN-2928.patch, 
> YARN-3051.wip.patch, YARN-3051_temp.patch
>
>
> Per design in YARN-2928, create backing storage read interface that can be 
> implemented by multiple backing storage implementations.





[jira] [Commented] (YARN-3051) [Storage abstraction] Create backing storage read interface for ATS readers

2015-07-06 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14615126#comment-14615126
 ] 

Hadoop QA commented on YARN-3051:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | pre-patch | 17m 22s | Findbugs (version ) appears to be broken on YARN-2928. |
| {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. |
| {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 1 new or modified test files. |
| {color:green}+1{color} | javac | 8m 10s | There were no new javac warning messages. |
| {color:green}+1{color} | javadoc | 10m 11s | There were no new javadoc warning messages. |
| {color:green}+1{color} | release audit | 0m 23s | The applied patch does not increase the total number of release audit warnings. |
| {color:green}+1{color} | checkstyle | 1m 18s | There were no new checkstyle issues. |
| {color:green}+1{color} | whitespace | 0m 0s | The patch has no lines that end in whitespace. |
| {color:green}+1{color} | install | 1m 39s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse | 0m 41s | The patch built with eclipse:eclipse. |
| {color:green}+1{color} | findbugs | 2m 23s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | yarn tests | 0m 24s | Tests passed in hadoop-yarn-api. |
| {color:green}+1{color} | yarn tests | 1m 19s | Tests passed in hadoop-yarn-server-timelineservice. |
| | | 43m 56s | |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | http://issues.apache.org/jira/secure/attachment/12743727/YARN-3051-YARN-2928.07.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | YARN-2928 / 18c4859 |
| hadoop-yarn-api test log | https://builds.apache.org/job/PreCommit-YARN-Build/8439/artifact/patchprocess/testrun_hadoop-yarn-api.txt |
| hadoop-yarn-server-timelineservice test log | https://builds.apache.org/job/PreCommit-YARN-Build/8439/artifact/patchprocess/testrun_hadoop-yarn-server-timelineservice.txt |
| Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/8439/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf903.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | https://builds.apache.org/job/PreCommit-YARN-Build/8439/console |


This message was automatically generated.

> [Storage abstraction] Create backing storage read interface for ATS readers
> ---
>
> Key: YARN-3051
> URL: https://issues.apache.org/jira/browse/YARN-3051
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Affects Versions: YARN-2928
>Reporter: Sangjin Lee
>Assignee: Varun Saxena
> Attachments: YARN-3051-YARN-2928.003.patch, 
> YARN-3051-YARN-2928.03.patch, YARN-3051-YARN-2928.04.patch, 
> YARN-3051-YARN-2928.05.patch, YARN-3051-YARN-2928.06.patch, 
> YARN-3051-YARN-2928.07.patch, YARN-3051.Reader_API.patch, 
> YARN-3051.Reader_API_1.patch, YARN-3051.Reader_API_2.patch, 
> YARN-3051.Reader_API_3.patch, YARN-3051.Reader_API_4.patch, 
> YARN-3051.wip.02.YARN-2928.patch, YARN-3051.wip.patch, YARN-3051_temp.patch
>
>
> Per design in YARN-2928, create backing storage read interface that can be 
> implemented by multiple backing storage implementations.





[jira] [Commented] (YARN-3051) [Storage abstraction] Create backing storage read interface for ATS readers

2015-07-06 Thread Varun Saxena (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14615295#comment-14615295
 ] 

Varun Saxena commented on YARN-3051:


[~zjshen], [~sjlee0], kindly review.

> [Storage abstraction] Create backing storage read interface for ATS readers
> ---
>
> Key: YARN-3051
> URL: https://issues.apache.org/jira/browse/YARN-3051
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Affects Versions: YARN-2928
>Reporter: Sangjin Lee
>Assignee: Varun Saxena
> Attachments: YARN-3051-YARN-2928.003.patch, 
> YARN-3051-YARN-2928.03.patch, YARN-3051-YARN-2928.04.patch, 
> YARN-3051-YARN-2928.05.patch, YARN-3051-YARN-2928.06.patch, 
> YARN-3051-YARN-2928.07.patch, YARN-3051.Reader_API.patch, 
> YARN-3051.Reader_API_1.patch, YARN-3051.Reader_API_2.patch, 
> YARN-3051.Reader_API_3.patch, YARN-3051.Reader_API_4.patch, 
> YARN-3051.wip.02.YARN-2928.patch, YARN-3051.wip.patch, YARN-3051_temp.patch
>
>
> Per design in YARN-2928, create backing storage read interface that can be 
> implemented by multiple backing storage implementations.





[jira] [Commented] (YARN-3051) [Storage abstraction] Create backing storage read interface for ATS readers

2015-07-06 Thread Sangjin Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14615354#comment-14615354
 ] 

Sangjin Lee commented on YARN-3051:
---

Thanks [~varun_saxena] for providing a quick update! My latest comments are 
mostly on FileSystemTimelineReaderImpl.java.

- l.151-152: [~zjshen] previously pointed this out but I don't see this changed 
in the latest patch. Do info values have to be converted into strings to be 
compared for equality? Is it because you worry about the info value types not 
implementing equals()? Can we not assume that it is expected for the info value 
types to provide sensible equals() implementations?
- l.192: How do you deal with a situation where "," is used in the tokens 
themselves? Note that flow names may contain commas (there is no reason they 
cannot). The separators should be escaped on the way in and unescaped on the 
way out. And it'd be good to have some unit tests for this case.
- l.220: matchMetricFilters() is static while matchEventFilters() is not. Could 
you make it consistent across all private helper methods?
- l.249: nit: it can be a simple return statement instead of the if clause.
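
For the l.192 point, escaping on the way in and unescaping on the way out could 
be sketched as follows. This assumes a backslash as the escape character, which 
is only one possible choice and not something the patch specifies; 
MappingLineCodec is a hypothetical name. A round-trip unit test would join 
tokens that contain commas and check that split() returns them unchanged.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical codec for one line of the app-to-flow mapping file.
final class MappingLineCodec {
  private static final char SEP = ',';
  private static final char ESC = '\\';

  // Joins tokens with ',' after escaping any ',' or '\' inside a token.
  static String join(List<String> tokens) {
    StringBuilder sb = new StringBuilder();
    for (int i = 0; i < tokens.size(); i++) {
      if (i > 0) {
        sb.append(SEP);
      }
      for (char c : tokens.get(i).toCharArray()) {
        if (c == SEP || c == ESC) {
          sb.append(ESC);
        }
        sb.append(c);
      }
    }
    return sb.toString();
  }

  // Splits on unescaped ',' and drops the escape characters again.
  static List<String> split(String line) {
    List<String> tokens = new ArrayList<>();
    StringBuilder current = new StringBuilder();
    boolean escaped = false;
    for (char c : line.toCharArray()) {
      if (escaped) {
        current.append(c);
        escaped = false;
      } else if (c == ESC) {
        escaped = true;
      } else if (c == SEP) {
        tokens.add(current.toString());
        current.setLength(0);
      } else {
        current.append(c);
      }
    }
    tokens.add(current.toString());
    return tokens;
  }
}
```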


> [Storage abstraction] Create backing storage read interface for ATS readers
> ---
>
> Key: YARN-3051
> URL: https://issues.apache.org/jira/browse/YARN-3051
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Affects Versions: YARN-2928
>Reporter: Sangjin Lee
>Assignee: Varun Saxena
> Attachments: YARN-3051-YARN-2928.003.patch, 
> YARN-3051-YARN-2928.03.patch, YARN-3051-YARN-2928.04.patch, 
> YARN-3051-YARN-2928.05.patch, YARN-3051-YARN-2928.06.patch, 
> YARN-3051-YARN-2928.07.patch, YARN-3051.Reader_API.patch, 
> YARN-3051.Reader_API_1.patch, YARN-3051.Reader_API_2.patch, 
> YARN-3051.Reader_API_3.patch, YARN-3051.Reader_API_4.patch, 
> YARN-3051.wip.02.YARN-2928.patch, YARN-3051.wip.patch, YARN-3051_temp.patch
>
>
> Per design in YARN-2928, create backing storage read interface that can be 
> implemented by multiple backing storage implementations.





[jira] [Commented] (YARN-3051) [Storage abstraction] Create backing storage read interface for ATS readers

2015-07-06 Thread Varun Saxena (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14615363#comment-14615363
 ] 

Varun Saxena commented on YARN-3051:


bq.  Do info values have to be converted into strings to be compared for 
equality? 
Sorry missed this change.

bq. How do you deal with a situation where "," is used in the tokens 
themselves? 
Wasn't expecting commas in flow. Will handle it.

bq. matchMetricFilters() is static while matchEventFilters() is not. Could you 
make it consistent across all private helper methods?
Ok. Missed it.

bq. it can be a simple return statement instead of the if clause.
Ok

> [Storage abstraction] Create backing storage read interface for ATS readers
> ---
>
> Key: YARN-3051
> URL: https://issues.apache.org/jira/browse/YARN-3051
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Affects Versions: YARN-2928
>Reporter: Sangjin Lee
>Assignee: Varun Saxena
> Attachments: YARN-3051-YARN-2928.003.patch, 
> YARN-3051-YARN-2928.03.patch, YARN-3051-YARN-2928.04.patch, 
> YARN-3051-YARN-2928.05.patch, YARN-3051-YARN-2928.06.patch, 
> YARN-3051-YARN-2928.07.patch, YARN-3051.Reader_API.patch, 
> YARN-3051.Reader_API_1.patch, YARN-3051.Reader_API_2.patch, 
> YARN-3051.Reader_API_3.patch, YARN-3051.Reader_API_4.patch, 
> YARN-3051.wip.02.YARN-2928.patch, YARN-3051.wip.patch, YARN-3051_temp.patch
>
>
> Per design in YARN-2928, create backing storage read interface that can be 
> implemented by multiple backing storage implementations.





[jira] [Commented] (YARN-3051) [Storage abstraction] Create backing storage read interface for ATS readers

2015-07-06 Thread Zhijie Shen (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14615387#comment-14615387
 ] 

Zhijie Shen commented on YARN-3051:
---

How about using the Apache Commons CSV library to handle the lookup file?

http://commons.apache.org/proper/commons-csv/index.html

> [Storage abstraction] Create backing storage read interface for ATS readers
> ---
>
> Key: YARN-3051
> URL: https://issues.apache.org/jira/browse/YARN-3051
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Affects Versions: YARN-2928
>Reporter: Sangjin Lee
>Assignee: Varun Saxena
> Attachments: YARN-3051-YARN-2928.003.patch, 
> YARN-3051-YARN-2928.03.patch, YARN-3051-YARN-2928.04.patch, 
> YARN-3051-YARN-2928.05.patch, YARN-3051-YARN-2928.06.patch, 
> YARN-3051-YARN-2928.07.patch, YARN-3051.Reader_API.patch, 
> YARN-3051.Reader_API_1.patch, YARN-3051.Reader_API_2.patch, 
> YARN-3051.Reader_API_3.patch, YARN-3051.Reader_API_4.patch, 
> YARN-3051.wip.02.YARN-2928.patch, YARN-3051.wip.patch, YARN-3051_temp.patch
>
>
> Per design in YARN-2928, create backing storage read interface that can be 
> implemented by multiple backing storage implementations.





[jira] [Commented] (YARN-3051) [Storage abstraction] Create backing storage read interface for ATS readers

2015-07-06 Thread Varun Saxena (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14615390#comment-14615390
 ] 

Varun Saxena commented on YARN-3051:


Oh, we have an Apache library for it. Will use it.
Thanks [~zjshen]

> [Storage abstraction] Create backing storage read interface for ATS readers
> ---
>
> Key: YARN-3051
> URL: https://issues.apache.org/jira/browse/YARN-3051
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Affects Versions: YARN-2928
>Reporter: Sangjin Lee
>Assignee: Varun Saxena
> Attachments: YARN-3051-YARN-2928.003.patch, 
> YARN-3051-YARN-2928.03.patch, YARN-3051-YARN-2928.04.patch, 
> YARN-3051-YARN-2928.05.patch, YARN-3051-YARN-2928.06.patch, 
> YARN-3051-YARN-2928.07.patch, YARN-3051.Reader_API.patch, 
> YARN-3051.Reader_API_1.patch, YARN-3051.Reader_API_2.patch, 
> YARN-3051.Reader_API_3.patch, YARN-3051.Reader_API_4.patch, 
> YARN-3051.wip.02.YARN-2928.patch, YARN-3051.wip.patch, YARN-3051_temp.patch
>
>
> Per design in YARN-2928, create backing storage read interface that can be 
> implemented by multiple backing storage implementations.





[jira] [Commented] (YARN-3051) [Storage abstraction] Create backing storage read interface for ATS readers

2015-07-06 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14615741#comment-14615741
 ] 

Hadoop QA commented on YARN-3051:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | pre-patch | 18m 30s | Findbugs (version ) appears to be broken on YARN-2928. |
| {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. |
| {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 1 new or modified test files. |
| {color:green}+1{color} | javac | 8m 35s | There were no new javac warning messages. |
| {color:green}+1{color} | javadoc | 10m 33s | There were no new javadoc warning messages. |
| {color:green}+1{color} | release audit | 0m 24s | The applied patch does not increase the total number of release audit warnings. |
| {color:green}+1{color} | checkstyle | 1m 22s | There were no new checkstyle issues. |
| {color:green}+1{color} | whitespace | 0m 0s | The patch has no lines that end in whitespace. |
| {color:green}+1{color} | install | 1m 48s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse | 0m 46s | The patch built with eclipse:eclipse. |
| {color:green}+1{color} | findbugs | 2m 31s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | yarn tests | 0m 25s | Tests passed in hadoop-yarn-api. |
| {color:green}+1{color} | yarn tests | 1m 21s | Tests passed in hadoop-yarn-server-timelineservice. |
| | | 46m 23s | |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | http://issues.apache.org/jira/secure/attachment/12743797/YARN-3051-YARN-2928.08.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | YARN-2928 / 6837552 |
| hadoop-yarn-api test log | https://builds.apache.org/job/PreCommit-YARN-Build/8443/artifact/patchprocess/testrun_hadoop-yarn-api.txt |
| hadoop-yarn-server-timelineservice test log | https://builds.apache.org/job/PreCommit-YARN-Build/8443/artifact/patchprocess/testrun_hadoop-yarn-server-timelineservice.txt |
| Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/8443/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf902.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | https://builds.apache.org/job/PreCommit-YARN-Build/8443/console |


This message was automatically generated.

> [Storage abstraction] Create backing storage read interface for ATS readers
> ---
>
> Key: YARN-3051
> URL: https://issues.apache.org/jira/browse/YARN-3051
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Affects Versions: YARN-2928
>Reporter: Sangjin Lee
>Assignee: Varun Saxena
> Attachments: YARN-3051-YARN-2928.003.patch, 
> YARN-3051-YARN-2928.03.patch, YARN-3051-YARN-2928.04.patch, 
> YARN-3051-YARN-2928.05.patch, YARN-3051-YARN-2928.06.patch, 
> YARN-3051-YARN-2928.07.patch, YARN-3051-YARN-2928.08.patch, 
> YARN-3051.Reader_API.patch, YARN-3051.Reader_API_1.patch, 
> YARN-3051.Reader_API_2.patch, YARN-3051.Reader_API_3.patch, 
> YARN-3051.Reader_API_4.patch, YARN-3051.wip.02.YARN-2928.patch, 
> YARN-3051.wip.patch, YARN-3051_temp.patch
>
>
> Per design in YARN-2928, create backing storage read interface that can be 
> implemented by multiple backing storage implementations.





[jira] [Commented] (YARN-3051) [Storage abstraction] Create backing storage read interface for ATS readers

2015-07-06 Thread Zhijie Shen (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14615747#comment-14615747
 ] 

Zhijie Shen commented on YARN-3051:
---

Hi Varun, thanks for updating the patch. I have only one remaining issue about 
this patch:

According to 
https://issues.apache.org/jira/secure/attachment/12743391/hbase-schema-proposal-for-aggregation.pdf,
it seems that we have chosen clusterId + appId to globally find a unique flow 
run. I think we should do the same here by adding clusterId, which is a 
mandatory field. /cc [~sjlee0].

Some other improvements will be required in the future for robustness and 
performance. Let's make sure we have a JIRA to improve the reader later.

1. Maybe we want to cache the mapping instead of reading it from the file for 
every query.
2. The limit should be pushed down into the for loop. If we want to retrieve 
just 10 entities, it is unnecessary to go through 1000 qualified candidates and 
finally pick the top 10.
3. We'd better avoid hard-coding "/" as the path separator, and we should use 
the FileSystem interface to operate on the files, such that the impl can also 
work with HDFS.
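
Item 1 could be approached along the lines of the sketch below. 
FlowMappingCache is a hypothetical class, and the fileLookup function is a 
stand-in for whatever code reads one entry from the lookup file. The sketch 
does not address invalidation when the file changes, which a real 
implementation would need to consider.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical cache in front of the app-to-flow lookup; not part of the patch.
final class FlowMappingCache {
  private final Map<String, String> cache = new ConcurrentHashMap<>();
  // Stand-in for the code that reads one mapping entry from the lookup file.
  private final Function<String, String> fileLookup;

  FlowMappingCache(Function<String, String> fileLookup) {
    this.fileLookup = fileLookup;
  }

  // Returns the mapping for appId, consulting the file only on a cache miss.
  // A null result from the lookup is not cached, so it is retried next time.
  String get(String appId) {
    return cache.computeIfAbsent(appId, fileLookup);
  }
}
```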

> [Storage abstraction] Create backing storage read interface for ATS readers
> ---
>
> Key: YARN-3051
> URL: https://issues.apache.org/jira/browse/YARN-3051
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Affects Versions: YARN-2928
>Reporter: Sangjin Lee
>Assignee: Varun Saxena
> Attachments: YARN-3051-YARN-2928.003.patch, 
> YARN-3051-YARN-2928.03.patch, YARN-3051-YARN-2928.04.patch, 
> YARN-3051-YARN-2928.05.patch, YARN-3051-YARN-2928.06.patch, 
> YARN-3051-YARN-2928.07.patch, YARN-3051-YARN-2928.08.patch, 
> YARN-3051.Reader_API.patch, YARN-3051.Reader_API_1.patch, 
> YARN-3051.Reader_API_2.patch, YARN-3051.Reader_API_3.patch, 
> YARN-3051.Reader_API_4.patch, YARN-3051.wip.02.YARN-2928.patch, 
> YARN-3051.wip.patch, YARN-3051_temp.patch
>
>
> Per design in YARN-2928, create backing storage read interface that can be 
> implemented by multiple backing storage implementations.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3051) [Storage abstraction] Create backing storage read interface for ATS readers

2015-07-06 Thread Varun Saxena (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14615776#comment-14615776
 ] 

Varun Saxena commented on YARN-3051:


[~zjshen],
bq. we have chosen clusterId + appId to globally find a unique flow run. I 
think here we should do it similar by adding clusterId
The current FS implementation has the cluster as part of the path, so there
will be an app_flow_mapping.csv for each cluster. In a way, clusterId is part
of the primary key even though it's not in app_flow_mapping.csv itself.
I hope that addresses your concern.

bq. 1. Maybe we want to cache the mapping instead of reading it from the file 
for every query.
Yes, we should do so. I plan to do these optimizations in a later JIRA. Some
other optimizations are also required: we are using a set instead of a map for
storing metrics and events, so I have to iterate over all of them. Any issue
with turning them into maps?

bq. 2. limit should be push down into the for loop. It's unnecessary that if we 
want to just retrieve.
The issue here is that we want a limit on entities, but these should be the
latest entities (sorted in descending order of created time). Having the
created time in the entity file name would help avoid reading all the files.

bq.3. We'd better avoid hard code "/" as the path separator, and we should use 
FileSystem interface to operate the files, such that the impl can also work 
with HDFS.
Ok.
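
The trade-off around the limit can be sketched as follows: the loop cannot
simply stop after `limit` matches, because the newest entities may come last,
but a bounded min-heap keeps at most `limit` candidates in memory while
scanning. The `Entity` class and its fields are hypothetical stand-ins, not
the real TimelineEntity.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

final class LatestEntities {
  /** Hypothetical stand-in for a timeline entity. */
  static final class Entity {
    final String id;
    final long createdTime;
    Entity(String id, long createdTime) {
      this.id = id;
      this.createdTime = createdTime;
    }
  }

  /** Returns up to limit entities, newest first, retaining at most limit at a time. */
  static List<Entity> latest(Iterable<Entity> candidates, int limit) {
    List<Entity> result = new ArrayList<>();
    if (limit <= 0) {
      return result;
    }
    Comparator<Entity> byCreated = Comparator.comparingLong((Entity e) -> e.createdTime);
    // Min-heap on created time: the head is the oldest entity kept so far.
    PriorityQueue<Entity> heap = new PriorityQueue<>(byCreated);
    for (Entity e : candidates) {
      if (heap.size() < limit) {
        heap.add(e);
      } else if (e.createdTime > heap.peek().createdTime) {
        heap.poll(); // evict the oldest retained entity
        heap.add(e);
      }
    }
    result.addAll(heap);
    result.sort(byCreated.reversed());
    return result;
  }
}
```

This still visits every candidate; encoding the created time in the entity
file name, as suggested above, is what would let the reader skip files
entirely.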



[jira] [Commented] (YARN-3051) [Storage abstraction] Create backing storage read interface for ATS readers

2015-07-06 Thread Zhijie Shen (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14615792#comment-14615792
 ] 

Zhijie Shen commented on YARN-3051:
---

bq. The current FS implementation had cluster as part of the path. So there 
will a app_flow_mapping.csv for each cluster. So in a way it is part of the 
primary key even though its not there in app_flow_mapping.csv
I hope that is what your concern was.

The problem is about the write path. Suppose we unfortunately have a duplicate
appId: one is clusterId1/appId and the other is clusterId2/appId. When the
former entity is written, you add appId to the mapping file. How do you write
the mapping file for clusterId2/appId? Overwrite the row for appId? Append one
more row for appId? Either will cause trouble when finding the right flow info
for a query that uses default values.



[jira] [Commented] (YARN-3051) [Storage abstraction] Create backing storage read interface for ATS readers

2015-07-06 Thread Varun Saxena (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14615818#comment-14615818
 ] 

Varun Saxena commented on YARN-3051:


bq. Overwriting the row of appId? Appending one more row of appId? 
No. cluster1 and cluster2 will each have their own directory.
I mean, if the default root directory is {{/tmp/timeline_service_data}} and
there are 2 cluster ids, we will have one app flow mapping file at
{{/tmp/timeline_service_data/cluster1/app_flow_mapping.csv}} and the other at
{{/tmp/timeline_service_data/cluster2/app_flow_mapping.csv}}
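
As a sketch of this layout, the per-cluster mapping file can be resolved from
the root directory and the cluster id without hard-coding a separator. The
example uses java.nio.file only so that it is self-contained; a version that
also works with HDFS would go through Hadoop's FileSystem and Path classes
instead. The class and method names are hypothetical.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

final class MappingFileLayout {
  static final String APP_FLOW_MAPPING_FILE = "app_flow_mapping.csv";

  /** Resolves <root>/<clusterId>/app_flow_mapping.csv. */
  static Path mappingFile(String rootDir, String clusterId) {
    return Paths.get(rootDir).resolve(clusterId).resolve(APP_FLOW_MAPPING_FILE);
  }

  public static void main(String[] args) {
    String root = "/tmp/timeline_service_data";
    // Two clusters with the same appId never share a mapping file.
    System.out.println(mappingFile(root, "cluster1"));
    System.out.println(mappingFile(root, "cluster2"));
  }
}
```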



[jira] [Commented] (YARN-3051) [Storage abstraction] Create backing storage read interface for ATS readers

2015-07-06 Thread Varun Saxena (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14615825#comment-14615825
 ] 

Varun Saxena commented on YARN-3051:


Is this approach fine, or would you prefer a single app flow mapping file? I
also segregated it with the intention of reducing the number of records to
read, but that will be less of a concern once we cache it.



[jira] [Commented] (YARN-3051) [Storage abstraction] Create backing storage read interface for ATS readers

2015-07-06 Thread Zhijie Shen (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14615839#comment-14615839
 ] 

Zhijie Shen commented on YARN-3051:
---

Okay, then it seems to be fine. I didn't notice that it's a per-cluster
mapping file.



[jira] [Commented] (YARN-3051) [Storage abstraction] Create backing storage read interface for ATS readers

2015-07-06 Thread Zhijie Shen (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14615930#comment-14615930
 ] 

Zhijie Shen commented on YARN-3051:
---

Will commit the patch later today if there are no more comments.



[jira] [Commented] (YARN-3051) [Storage abstraction] Create backing storage read interface for ATS readers

2015-07-06 Thread Sangjin Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14615941#comment-14615941
 ] 

Sangjin Lee commented on YARN-3051:
---

+1 from me. Thanks!



[jira] [Commented] (YARN-3051) [Storage abstraction] Create backing storage read interface for ATS readers

2015-07-06 Thread Varun Saxena (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14616218#comment-14616218
 ] 

Varun Saxena commented on YARN-3051:


Thanks [~zjshen] for the commit.
Thanks [~zjshen], [~sjlee0], [~gtCarrera9] and [~jrottinghuis] for the review.

> [Storage abstraction] Create backing storage read interface for ATS readers
> ---
>
> Key: YARN-3051
> URL: https://issues.apache.org/jira/browse/YARN-3051
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Affects Versions: YARN-2928
>Reporter: Sangjin Lee
>Assignee: Varun Saxena
> Fix For: YARN-2928
>
> Attachments: YARN-3051-YARN-2928.003.patch, 
> YARN-3051-YARN-2928.03.patch, YARN-3051-YARN-2928.04.patch, 
> YARN-3051-YARN-2928.05.patch, YARN-3051-YARN-2928.06.patch, 
> YARN-3051-YARN-2928.07.patch, YARN-3051-YARN-2928.08.patch, 
> YARN-3051.Reader_API.patch, YARN-3051.Reader_API_1.patch, 
> YARN-3051.Reader_API_2.patch, YARN-3051.Reader_API_3.patch, 
> YARN-3051.Reader_API_4.patch, YARN-3051.wip.02.YARN-2928.patch, 
> YARN-3051.wip.patch, YARN-3051_temp.patch
>
>
> Per design in YARN-2928, create backing storage read interface that can be 
> implemented by multiple backing storage implementations.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3051) [Storage abstraction] Create backing storage read interface for ATS readers

2015-03-27 Thread Li Lu (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14384389#comment-14384389
 ] 

Li Lu commented on YARN-3051:
-

Hi [~varun_saxena], any progress on the reader API side for now? The new reader 
API is blocking our storage implementations, so if you have any bandwidth 
problems feel free to let us know. I can take it over if necessary. 

> [Storage abstraction] Create backing storage read interface for ATS readers
> ---
>
> Key: YARN-3051
> URL: https://issues.apache.org/jira/browse/YARN-3051
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Reporter: Sangjin Lee
>Assignee: Varun Saxena
>
> Per design in YARN-2928, create backing storage read interface that can be 
> implemented by multiple backing storage implementations.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3051) [Storage abstraction] Create backing storage read interface for ATS readers

2015-03-27 Thread Varun Saxena (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14384422#comment-14384422
 ] 

Varun Saxena commented on YARN-3051:


I am relatively free this weekend. So will be able to work on this on priority. 
Will let you know if I run into bandwidth issues.

We had decided on the three APIs below, which are somewhat similar to what
existed in ATS v1.

Now, as you mentioned in a comment elsewhere, we need to support metrics too.
So, what kind of queries have we decided to support? For instance, queries
such as "get apps which have a particular metric's value less than or greater
than something"?

{code}
  TimelineEntities getEntities(String entityType, long limit,
  long windowStart, Long windowEnd, String fromId, long fromTs,
  Collection filters,
  EnumSet fieldsToRetrieve) throws IOException;

  TimelineEntity getEntity(String entityId, String entityType,
  EnumSet fieldsToRetrieve) throws IOException;

  TimelineEvents getEntityTimelines(String entityType,
  SortedSet entityIds, long limit, long windowStart,
  long windowEnd, Set eventTypes) throws IOException;
{code}



[jira] [Commented] (YARN-3051) [Storage abstraction] Create backing storage read interface for ATS readers

2015-03-27 Thread Li Lu (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14384566#comment-14384566
 ] 

Li Lu commented on YARN-3051:
-

bq. We had decided on below three APIs' which are somewhat similar to what 
existed in ATS v1.
Isn't that what we already have in YARN-3047? 



[jira] [Commented] (YARN-3051) [Storage abstraction] Create backing storage read interface for ATS readers

2015-03-27 Thread Varun Saxena (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14384578#comment-14384578
 ] 

Varun Saxena commented on YARN-3051:


No. I had initially kept it there but later moved it out so that the store
implementation can be in YARN-3051. This JIRA will have the File System
implementation.



[jira] [Commented] (YARN-3051) [Storage abstraction] Create backing storage read interface for ATS readers

2015-03-27 Thread Sangjin Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14384615#comment-14384615
 ] 

Sangjin Lee commented on YARN-3051:
---

A couple of things to discuss:

In principle, a *shallow* view of the entity will be returned by default,
right? Specifically, I'm wondering whether all configs and metrics should be
included in the default view or not. I wonder what ATS v.1 does in this regard.
FYI, I believe most of the YARN REST API returns a shallow view of objects.
Note that the size of the responses could become quite big if we include
configs and metrics by default.

On a related note, if we decide to return shallow views by default, then the 
question is, how do we ask the reader to get things like configs and metrics? 
The reader API as well as the reader storage interface should be able to 
support calls to retrieve config/metrics, perhaps with new methods.

bq. For instance, queries such as get apps which have a particular metric's 
value less than or greater than something ?

Metric/config-based queries will probably need changes to the API. We would
want to be able to support queries like "return apps where config X = Y" or
"return apps where metric A > B". But we can consider them advanced queries.



[jira] [Commented] (YARN-3051) [Storage abstraction] Create backing storage read interface for ATS readers

2015-03-27 Thread Varun Saxena (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14384634#comment-14384634
 ] 

Varun Saxena commented on YARN-3051:


That is what was initially decided. We can handle the file system
implementation in another JIRA as well. But as the File System implementation
will be the default, we thought we could handle it here.



[jira] [Commented] (YARN-3051) [Storage abstraction] Create backing storage read interface for ATS readers

2015-03-27 Thread Varun Saxena (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14384656#comment-14384656
 ] 

Varun Saxena commented on YARN-3051:


Configs and metrics will be retrieved as part of an entity.
We can select which fields to retrieve via {{EnumSet fieldsToRetrieve}};
null means all fields will be retrieved.
So if we do not want all configs and metrics, we can leave them out and list
only the other fields in fieldsToRetrieve. This can be specified in the REST
URL as {{fields=}}



[jira] [Commented] (YARN-3051) [Storage abstraction] Create backing storage read interface for ATS readers

2015-03-27 Thread Li Lu (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14384675#comment-14384675
 ] 

Li Lu commented on YARN-3051:
-

bq. configs and metrics will be retrieved as part of an entity.
The most significant concern here is the size of configs and metrics. I think 
that's why [~sjlee0] is proposing a shallow view here. Still waiting for 
[~zjshen]'s confirmation for v1, but for v2 I think we may need something like 
this. 



[jira] [Commented] (YARN-3051) [Storage abstraction] Create backing storage read interface for ATS readers

2015-03-27 Thread Varun Saxena (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14384684#comment-14384684
 ] 

Varun Saxena commented on YARN-3051:


Keeping this in mind, do you think a new method will be required to fetch
configs and metrics? I guess not.



[jira] [Commented] (YARN-3051) [Storage abstraction] Create backing storage read interface for ATS readers

2015-03-27 Thread Varun Saxena (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14384699#comment-14384699
 ] 

Varun Saxena commented on YARN-3051:


To elaborate on what the getEntities API will do:

It will support filters similar to secondary filters by matching the info
field. Yes, the API would need to be enhanced to support queries based on
configs and metrics. I think it can be part of the same getEntities API.

As mentioned above, for configs only equality needs to be checked, while for
metrics all the relational operators will have to be supported.
We can probably have 2 additional parameters in the API, namely configFilters
and metricsFilters. I guess that should do.

I don't think there will be any other field on the basis of which filtering
will be done.
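
A minimal sketch of what such filters could look like: equality for configs,
relational operators for metrics. All type and method names here
(`MetricFilter`, `Op`, `matches`) are hypothetical and only illustrate the
shape of the two proposed parameters; they are not taken from the patch.

```java
import java.util.Collection;
import java.util.Map;

final class EntityFilters {
  /** Relational operators a metric filter may use. */
  enum Op { LT, LE, EQ, NE, GE, GT }

  /** Hypothetical filter of the form: metric <op> value. */
  static final class MetricFilter {
    final String metric;
    final Op op;
    final long value;
    MetricFilter(String metric, Op op, long value) {
      this.metric = metric;
      this.op = op;
      this.value = value;
    }
    boolean test(long actual) {
      switch (op) {
        case LT: return actual < value;
        case LE: return actual <= value;
        case EQ: return actual == value;
        case NE: return actual != value;
        case GE: return actual >= value;
        default: return actual > value; // GT
      }
    }
  }

  /** True if every config filter matches by equality and every metric filter holds. */
  static boolean matches(Map<String, String> configs, Map<String, Long> metrics,
      Map<String, String> configFilters, Collection<MetricFilter> metricFilters) {
    for (Map.Entry<String, String> f : configFilters.entrySet()) {
      if (!f.getValue().equals(configs.get(f.getKey()))) {
        return false;
      }
    }
    for (MetricFilter f : metricFilters) {
      Long actual = metrics.get(f.metric);
      // An entity without the metric cannot satisfy a filter on it.
      if (actual == null || !f.test(actual)) {
        return false;
      }
    }
    return true;
  }
}
```

Keeping metrics in a map keyed by name, as assumed here, avoids scanning a
whole set for each filter.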



[jira] [Commented] (YARN-3051) [Storage abstraction] Create backing storage read interface for ATS readers

2015-03-27 Thread Varun Saxena (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14384698#comment-14384698
 ] 

Varun Saxena commented on YARN-3051:


Yeah, I meant it can still be supported if the client specifies which fields
are to be retrieved.

But I do understand the concern here. The default view should return all fields 
except configs and metrics.



[jira] [Commented] (YARN-3051) [Storage abstraction] Create backing storage read interface for ATS readers

2015-03-27 Thread Varun Saxena (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14384709#comment-14384709
 ] 

Varun Saxena commented on YARN-3051:


On the point about a shallow view of the entity: we can say that if
{{fieldsToRetrieve}} is null, i.e. the client does not specify which fields to
retrieve, the store implementation will return all fields except configs and
metrics.
I can add another special field called "all", which would indicate that all
fields have to be retrieved. So if the client specifies fields=all in the REST
URL, the storage implementation will fetch all the fields. Thoughts?
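
The proposed defaulting rule can be sketched as follows. The `Field` enum and
its constants are hypothetical names chosen for illustration; only the rule
itself (null means everything except configs and metrics, "all" means
everything) comes from the proposal.

```java
import java.util.EnumSet;

final class FieldDefaults {
  /** Hypothetical field names; ALL is the proposed special value. */
  enum Field { ALL, INFO, EVENTS, RELATES_TO, CONFIGS, METRICS }

  /** Expands the caller's request into the concrete fields the store should read. */
  static EnumSet<Field> resolve(EnumSet<Field> fieldsToRetrieve) {
    // Every concrete field, i.e. everything except the ALL marker itself.
    EnumSet<Field> everything = EnumSet.complementOf(EnumSet.of(Field.ALL));
    if (fieldsToRetrieve == null) {
      // Shallow default view: skip the potentially large configs and metrics.
      everything.remove(Field.CONFIGS);
      everything.remove(Field.METRICS);
      return everything;
    }
    if (fieldsToRetrieve.contains(Field.ALL)) {
      return everything;
    }
    return EnumSet.copyOf(fieldsToRetrieve);
  }

  public static void main(String[] args) {
    System.out.println(resolve(null));
    System.out.println(resolve(EnumSet.of(Field.ALL)));
  }
}
```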



[jira] [Commented] (YARN-3051) [Storage abstraction] Create backing storage read interface for ATS readers

2015-03-27 Thread Varun Saxena (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14384761#comment-14384761
 ] 

Varun Saxena commented on YARN-3051:


bq. We may not only need to do queries for timeline entities, but also 
something solely for their configs and/or metrics
But IIUC, metrics and configs would still be tied to or encapsulated inside an
entity. The entity may be a cluster, an application, or something else.
So when I want all configs for an app, I get them by specifying fields=configs
in the REST URL. And if I want both metrics and configs for an app, I can
say fields=configs,metrics.
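
For illustration, a `fields` query parameter such as `configs,metrics` could
be turned into the `EnumSet` passed to the reader along these lines. The
`Field` constants and the parsing rules (case-insensitive, comma-separated,
unknown names rejected) are assumptions, not behaviour confirmed in this
thread.

```java
import java.util.EnumSet;
import java.util.Locale;

final class FieldsParam {
  /** Hypothetical field names. */
  enum Field { INFO, EVENTS, CONFIGS, METRICS }

  /** Parses e.g. "configs,metrics"; returns null when the parameter is absent. */
  static EnumSet<Field> parse(String fieldsParam) {
    if (fieldsParam == null || fieldsParam.trim().isEmpty()) {
      return null; // caller did not ask for specific fields
    }
    EnumSet<Field> fields = EnumSet.noneOf(Field.class);
    for (String name : fieldsParam.split(",")) {
      // valueOf throws IllegalArgumentException for unknown field names.
      fields.add(Field.valueOf(name.trim().toUpperCase(Locale.ROOT)));
    }
    return fields;
  }

  public static void main(String[] args) {
    System.out.println(parse("configs,metrics"));
  }
}
```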
 



[jira] [Commented] (YARN-3051) [Storage abstraction] Create backing storage read interface for ATS readers

2015-03-27 Thread Li Lu (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14384783#comment-14384783
 ] 

Li Lu commented on YARN-3051:
-

bq. So when I say get all configs for an app. I do that by specifying 
fields=configs in REST URL. And if I want metrics and configs for an app, I can 
say fields=configs,metrics.

OK, I'm just thinking out loud. 
So do we need to touch both the entity table and the config/metric table on
the underlying storage? Now suppose I already have a timeline entity, without
its metrics, and I'd like to draw a time series for its hdfs_bytes_write. Do I
need to regenerate the timeline entity together with the metric, or can I say
something like "get hdfs_bytes_write for this context"?

BTW, we may want to consider the relationship between the context and timeline 
entities on the reader side. The context information is the PK of the timeline 
entity rows. 



[jira] [Commented] (YARN-3051) [Storage abstraction] Create backing storage read interface for ATS readers

2015-03-27 Thread Varun Saxena (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14384785#comment-14384785
 ] 

Varun Saxena commented on YARN-3051:


Just to elaborate further, the API below will be used to serve the use case above.
{code}
  TimelineEntity getEntity(String entityId, String entityType,
      EnumSet fieldsToRetrieve)
{code}
Assuming the entity ID is the same as the app ID when the entity type is 
"application", we can fetch the configs for application_12345_0001 like this:
{{http:///application/application_12345_0001?fields=configs}}
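
As a rough illustration of how the fields query parameter could map onto fieldsToRetrieve, here is a minimal sketch. The Field enum and the FieldsParam helper are hypothetical names made up for this example, not part of any patch on this JIRA, and treating an absent parameter as an empty set is an assumption.

```java
import java.util.EnumSet;
import java.util.Locale;

// Hypothetical enum; the constants mirror the "fields" values used in this thread.
enum Field { CONFIGS, METRICS, INFO }

// Hypothetical helper that turns the REST "fields" value into an EnumSet.
final class FieldsParam {
  private FieldsParam() {
  }

  // Parses a comma-separated value such as "configs,metrics".
  // A null or blank value yields an empty set (assumed to mean "default view").
  static EnumSet<Field> parse(String value) {
    EnumSet<Field> fields = EnumSet.noneOf(Field.class);
    if (value == null) {
      return fields;
    }
    for (String token : value.split(",")) {
      String name = token.trim();
      if (name.isEmpty()) {
        continue;
      }
      // Field.valueOf throws IllegalArgumentException for an unknown name.
      fields.add(Field.valueOf(name.toUpperCase(Locale.ROOT)));
    }
    return fields;
  }

  public static void main(String[] args) {
    System.out.println(parse("configs,metrics"));
  }
}
```

The REST layer would then hand the parsed set to getEntity as fieldsToRetrieve. Whether an unknown field name should be rejected or silently ignored is a design choice; this sketch rejects it.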



[jira] [Commented] (YARN-3051) [Storage abstraction] Create backing storage read interface for ATS readers

2015-03-27 Thread Varun Saxena (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14384788#comment-14384788
 ] 

Varun Saxena commented on YARN-3051:


Hmm... If you don't mind, can you share the schema decided on for the 
Phoenix-based storage? That will be helpful in designing the API.



[jira] [Commented] (YARN-3051) [Storage abstraction] Create backing storage read interface for ATS readers

2015-03-27 Thread Li Lu (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14384792#comment-14384792
 ] 

Li Lu commented on YARN-3051:
-

Sure. Will post it soon. 



[jira] [Commented] (YARN-3051) [Storage abstraction] Create backing storage read interface for ATS readers

2015-03-27 Thread Varun Saxena (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14384797#comment-14384797
 ] 

Varun Saxena commented on YARN-3051:


Thanks.



[jira] [Commented] (YARN-3051) [Storage abstraction] Create backing storage read interface for ATS readers

2015-03-27 Thread Li Lu (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14384796#comment-14384796
 ] 

Li Lu commented on YARN-3051:
-

BTW, the reader APIs are not only for the Phoenix storage itself. We also need 
to consider the HBase implementation. On the design side, we may want to 
consider the common strategies, and I don't think a single storage 
implementation would block the progress of this JIRA. 



[jira] [Commented] (YARN-3051) [Storage abstraction] Create backing storage read interface for ATS readers

2015-03-27 Thread Varun Saxena (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14384802#comment-14384802
 ] 

Varun Saxena commented on YARN-3051:


Yeah, it should not block the progress of this JIRA. I was just trying to 
understand your use case better.



[jira] [Commented] (YARN-3051) [Storage abstraction] Create backing storage read interface for ATS readers

2015-03-27 Thread Sangjin Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14385025#comment-14385025
 ] 

Sangjin Lee commented on YARN-3051:
---

Wanted to add my 2 cents before the weekend.

As for the default view of an entity, are we agreed then that it means all the 
data at the level of the entity but *not* going into config/metrics/info? I 
want to stress that this default behavior should be explicit in the code so 
there is no confusion.

I think it's up to us to define in terms of APIs how to best capture all the 
query use cases. If it can be worked through fieldsToRetrieve, that is fine. We 
need to make sure the APIs are clear in terms of what they do.

The following are the types of queries that I think this storage reader 
API (and the reader itself) would need to support. This is not an exhaustive 
list. There may be more. But at least these need to be supported well:
- given an id, return the entity (default; see above)
- given an id, return all metrics of the entity
- given an id, return the entire config of the entity
- given an id, return the entity along with metrics/configs/info
- (optional?) given an id, return one metric or some metrics (by name) of the 
entity (possibly retrieving the time series of its values)
- (optional?) given an id, return one or some config entries (by name) of the 
entity
- (needs some more thought) relational queries (e.g. given an app id, 
return the app entity along with its containers)

Again, this is not an exhaustive list, or even a completely thought-out list. 
But it should give us some idea on how to define the APIs. Hope this helps...
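
To illustrate the first four query types and an explicit default view, here is a minimal in-memory sketch. All names in it (EntityField, EntityView, SketchEntityReader, DEFAULT_VIEW) are hypothetical and exist only for this example. It assumes the default view means entity-level data with empty configs/metrics/info, and it does not attempt the by-name or relational queries.

```java
import java.util.Collections;
import java.util.EnumSet;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical names throughout; none of these types come from a patch.
enum EntityField { CONFIGS, METRICS, INFO }

final class EntityView {
  final String id;
  final Map<String, String> configs;
  final Map<String, Long> metrics;
  final Map<String, String> info;

  EntityView(String id, Map<String, String> configs,
      Map<String, Long> metrics, Map<String, String> info) {
    this.id = id;
    this.configs = configs;
    this.metrics = metrics;
    this.info = info;
  }
}

final class SketchEntityReader {
  // The default view is spelled out in code: entity-level data only.
  static final Set<EntityField> DEFAULT_VIEW =
      Collections.unmodifiableSet(EnumSet.noneOf(EntityField.class));

  private final Map<String, EntityView> store = new HashMap<>();

  void put(EntityView full) {
    store.put(full.id, full);
  }

  // Given an id, return the entity in its default view.
  EntityView getEntity(String id) {
    return getEntity(id, DEFAULT_VIEW);
  }

  // Given an id, return the entity plus only the requested parts.
  // Parts that were not requested come back as empty maps.
  EntityView getEntity(String id, Set<EntityField> fields) {
    EntityView full = store.get(id);
    if (full == null) {
      return null;
    }
    return new EntityView(
        full.id,
        fields.contains(EntityField.CONFIGS)
            ? full.configs : Collections.<String, String>emptyMap(),
        fields.contains(EntityField.METRICS)
            ? full.metrics : Collections.<String, Long>emptyMap(),
        fields.contains(EntityField.INFO)
            ? full.info : Collections.<String, String>emptyMap());
  }
}
```

Passing EnumSet.of(EntityField.METRICS) covers "all metrics", EnumSet.of(EntityField.CONFIGS) covers "entire config", and EnumSet.allOf(EntityField.class) covers "entity along with metrics/configs/info". Retrieving individual metrics or config entries by name would need an extra argument that this sketch leaves out.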


