[jira] [Updated] (YARN-4074) [timeline reader] implement support for querying for flows and flow runs

2017-10-21 Thread Varun Saxena (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4074?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Varun Saxena updated YARN-4074:
---
Fix Version/s: 2.9.0

> [timeline reader] implement support for querying for flows and flow runs
> 
>
> Key: YARN-4074
> URL: https://issues.apache.org/jira/browse/YARN-4074
> Project: Hadoop YARN
> Issue Type: Sub-task
> Components: timelineserver
> Affects Versions: YARN-2928
> Reporter: Sangjin Lee
> Assignee: Sangjin Lee
> Fix For: 2.9.0
>
> Attachments: YARN-4074-YARN-2928.007.patch, 
> YARN-4074-YARN-2928.008.patch, YARN-4074-YARN-2928.POC.001.patch, 
> YARN-4074-YARN-2928.POC.002.patch, YARN-4074-YARN-2928.POC.003.patch, 
> YARN-4074-YARN-2928.POC.004.patch, YARN-4074-YARN-2928.POC.005.patch, 
> YARN-4074-YARN-2928.POC.006.patch
>
>
> Implement support for querying for flows and flow runs.
> We should be able to query for the most recent N flows, etc.
> This includes changes to the {{TimelineReader}} API if necessary, as well as 
> implementation of the API.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)




[jira] [Updated] (YARN-4074) [timeline reader] implement support for querying for flows and flow runs

2015-09-17 Thread Sangjin Lee (JIRA)


Sangjin Lee updated YARN-4074:
--
Attachment: YARN-4074-YARN-2928.007.patch

v.7 patch posted.

This patch is based directly on the YARN-2928 branch, now that YARN-3901 has 
been resolved. Otherwise, there are no real changes from the previous v.6 patch.



[jira] [Updated] (YARN-4074) [timeline reader] implement support for querying for flows and flow runs

2015-09-17 Thread Sangjin Lee (JIRA)


Sangjin Lee updated YARN-4074:
--
Attachment: YARN-4074-YARN-2928.008.patch

v.8 patch posted.

Fixed the checkstyle and findbugs issues.



[jira] [Updated] (YARN-4074) [timeline reader] implement support for querying for flows and flow runs

2015-09-16 Thread Sangjin Lee (JIRA)


Sangjin Lee updated YARN-4074:
--
Attachment: YARN-4074-YARN-2928.POC.006.patch

v.6 POC patch posted.

Renamed {{TimelineEntityReader.createTable()}} to 
{{TimelineEntityReader.getTable()}}. The reader now reuses the same instance 
for a given table.
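
The rename from "create" to "get" fits a method that hands back an existing object instead of building a new one on each call. A minimal sketch of that pattern in plain Java follows. The names TableHandle and ReaderSketch are hypothetical stand-ins; the real HBase-backed classes from the patch are not shown in this thread.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for a table wrapper; not the actual YARN class.
class TableHandle {
  static final AtomicInteger CREATED = new AtomicInteger();
  final String name;

  TableHandle(String name) {
    this.name = name;
    CREATED.incrementAndGet(); // count constructions so reuse is observable
  }
}

// Hypothetical reader that hands out one shared handle for its table.
class ReaderSketch {
  // Created once when the class is initialized and shared by all readers.
  private static final TableHandle FLOW_RUN_TABLE = new TableHandle("flowrun");

  // "get" rather than "create": callers receive the existing instance.
  TableHandle getTable() {
    return FLOW_RUN_TABLE;
  }
}
```

Two ReaderSketch objects return the same TableHandle, so the handle is constructed only once no matter how many readers are created.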



[jira] [Updated] (YARN-4074) [timeline reader] implement support for querying for flows and flow runs

2015-09-14 Thread Sangjin Lee (JIRA)


Sangjin Lee updated YARN-4074:
--
Attachment: YARN-4074-YARN-2928.POC.005.patch

POC v.5 patch posted.

It is mostly a rebase against the v.8 patch for YARN-3901, and it should apply 
cleanly on top of that patch. Again, your comments are greatly appreciated.



[jira] [Updated] (YARN-4074) [timeline reader] implement support for querying for flows and flow runs

2015-09-08 Thread Sangjin Lee (JIRA)


Sangjin Lee updated YARN-4074:
--
Attachment: YARN-4074-YARN-2928.POC.004.patch

v.4 POC patch posted.

- added the XmlElement annotation for flow runs in the flow activity entity
- rebased against the v.5 patch for YARN-3901
- added more unit tests
- made sure the IDs are set correctly on flow run entities and flow activity 
entities



[jira] [Updated] (YARN-4074) [timeline reader] implement support for querying for flows and flow runs

2015-09-04 Thread Sangjin Lee (JIRA)


Sangjin Lee updated YARN-4074:
--
Attachment: YARN-4074-YARN-2928.POC.003.patch

POC v.3 patch posted.

Key changes include
- switched from Get.setMaxResultSize() to PageFilter (more on that below)
- major refactoring of HBaseTimelineReaderImpl
-- introduced TimelineEntityReader and the hierarchy of classes to isolate 
proper reading per type
- added unit tests to test HBaseTimelineReaderImpl for flow activity and flow 
runs
- fixed an issue in FlowScanner where cells were returned in the wrong order, 
which broke Column.readResult()
- made *RowKey classes real object classes, and added the parseRowKey method 
that returns an instance of the RowKey
- fixed the order of the add and pollLast
- renamed FlowEntity to FlowRunEntity
- added the compareTo() method for FlowActivityEntity
- passed the type into the FlowActivityEntity constructor
- set configs for FlowActivityEntity and FlowRunEntity to null
- improved the way we get string values from info for FlowActivityEntity and 
FlowRunEntity
- added getNumberOfRuns() to FlowActivityEntity
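
The "add and pollLast" item, together with the new compareTo(), suggests a bounded sorted set for keeping the most recent N flows. The patch itself is not shown in this thread, so the following is only an assumed illustration in plain Java. FlowActivitySketch, its fields, and TopN.mostRecent are hypothetical names, not the real classes.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

// Hypothetical stand-in for FlowActivityEntity; the fields are assumptions.
class FlowActivitySketch implements Comparable<FlowActivitySketch> {
  final long dayTimestamp;
  final String flowName;

  FlowActivitySketch(long dayTimestamp, String flowName) {
    this.dayTimestamp = dayTimestamp;
    this.flowName = flowName;
  }

  // Most recent first; ties broken by flow name so distinct flows are kept.
  @Override
  public int compareTo(FlowActivitySketch other) {
    int byTime = Long.compare(other.dayTimestamp, dayTimestamp);
    return byTime != 0 ? byTime : flowName.compareTo(other.flowName);
  }
}

class TopN {
  // Keeps at most `limit` entries, retaining the ones that sort first.
  static List<FlowActivitySketch> mostRecent(
      Iterable<FlowActivitySketch> input, int limit) {
    TreeSet<FlowActivitySketch> sorted = new TreeSet<>();
    for (FlowActivitySketch e : input) {
      sorted.add(e);               // add first...
      if (sorted.size() > limit) {
        sorted.pollLast();         // ...then evict the entry that sorts last
      }
    }
    return new ArrayList<>(sorted);
  }
}
```

Adding before evicting matters here: evicting first could drop an entry that is newer than the one about to be added.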

It is actually pretty close to being ready, but since YARN-3901 is still 
outstanding, I'm not making it an official patch yet.

As for the PageFilter change, I concluded that setMaxResultSize() is not the 
right API for limiting the number of rows, and I believe PageFilter is the 
right thing to use. I also added counting logic so that we return the right 
number of records even if the result iterator advances.
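
One possible reason client-side counting is still needed with a page filter: the HBase documentation notes that PageFilter is applied separately on each region server, so a scan can hand the client more rows than the page size. A minimal sketch of such a counter over a plain Java iterator follows. LimitedIterator is a hypothetical name with no HBase dependency, and the patch's actual counting logic is not shown in this thread.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;

// Hypothetical wrapper: stops after `limit` items regardless of how many
// the underlying iterator could still supply.
class LimitedIterator<T> implements Iterator<T> {
  private final Iterator<T> source;
  private final long limit;
  private long returned;

  LimitedIterator(Iterator<T> source, long limit) {
    this.source = source;
    this.limit = limit;
  }

  @Override
  public boolean hasNext() {
    // Check the count first so the source is not consulted past the limit.
    return returned < limit && source.hasNext();
  }

  @Override
  public T next() {
    if (!hasNext()) {
      throw new NoSuchElementException();
    }
    returned++;
    return source.next();
  }
}
```

Wrapping the scanner's iterator this way keeps the limit exact on the client side while the filter still reduces the work done on the servers.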

As for the FlowScanner issue mentioned above, [~vrushalic] and [~jrottinghuis] 
debugged it and tracked it down to a bug in YARN-3901. As such, this change 
will likely be made in the final YARN-3901 patch. I included it here only for 
completeness and to make the unit tests pass.

You should be able to apply the YARN-3901 v.3 patch and then this patch 
cleanly. Let me know if you have any questions.

I'd greatly appreciate review feedback. I understand it's a lot of code...



[jira] [Updated] (YARN-4074) [timeline reader] implement support for querying for flows and flow runs

2015-08-27 Thread Sangjin Lee (JIRA)


Sangjin Lee updated YARN-4074:
--
Attachment: YARN-4074-YARN-2928.POC.002.patch

Posting a v.2 POC patch. This adds the flow run query.

As for [~djp]'s comments, yes, I agree that the reader code needs more serious 
refactoring, both in the API as well as the implementation.

I believe [~varun_saxena] is looking into cleaning up the filters and so on in 
YARN-3863, so improving the API would be taken up by Varun. Varun?

I'd also like to restructure the implementation further. This POC 
patch is by no means an indication of the final form of this patch. I just 
wanted to get it out there so we can ensure it is correct and discuss the 
approach taken here. I hope that clarifies things a bit.



[jira] [Updated] (YARN-4074) [timeline reader] implement support for querying for flows and flow runs

2015-08-26 Thread Sangjin Lee (JIRA)


Sangjin Lee updated YARN-4074:
--
Attachment: YARN-4074-YARN-2928.POC.001.patch

Posting a v.1 POC patch. This implements the first query (the flow activity 
query). I'll follow up tomorrow with another patch that also implements the 
second query.

This is to get the design choices and correctness reviewed first. The patch
- includes the flow activity query as part of getEntities()
- creates a data container for the flow activity table called FlowActivityEntity

It probably needs a fair amount of refactoring to make the reader code more 
manageable. Also, I need to add unit tests. They will come later.
