[jira] [Commented] (YARN-3076) YarnClient implementation to retrieve label to node mapping

2015-02-18 Thread Wangda Tan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3076?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14326861#comment-14326861
 ] 

Wangda Tan commented on YARN-3076:
--

Thanks, patch LGTM. I will commit it after Jenkins gets back.

 YarnClient implementation to retrieve label to node mapping
 ---

 Key: YARN-3076
 URL: https://issues.apache.org/jira/browse/YARN-3076
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: client
Affects Versions: 2.7.0
Reporter: Varun Saxena
Assignee: Varun Saxena
 Attachments: YARN-3076.001.patch, YARN-3076.002.patch, 
 YARN-3076.003.patch






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2942) Aggregated Log Files should be combined

2015-02-18 Thread Vinod Kumar Vavilapalli (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2942?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14326958#comment-14326958
 ] 

Vinod Kumar Vavilapalli commented on YARN-2942:
---

bq. The problem here is that the aggregated log files are not in an 
append-friendly format (TFile). We'd have to change the file format that 
they're in (perhaps reusing the similar format I created in this patch), but 
this wouldn't be backwards compatible.
Precisely the point. I think we should have an append-friendly format, as an 
extension of today's TFile. YARN-2548 also needs the same extension. We can try 
making this a compatible evolution. Even if we cannot, we can simply support 
both formats for compatibility.
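
To illustrate what "append-friendly" could mean here, the following is a minimal sketch, not TFile and not the format from the patch. It assumes a hypothetical layout of length-prefixed key/value records, so a later writer can add records at the end without rewriting the bytes already stored. The class and method names are invented for illustration, and an in-memory buffer stands in for a file.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only: self-delimiting records that can be appended.
class AppendFriendlyLog {

    // Appends one record as [keyLen][keyBytes][valueLen][valueBytes].
    // Existing bytes in the sink are never touched.
    static void append(ByteArrayOutputStream sink, String key, String value) {
        try {
            DataOutputStream out = new DataOutputStream(sink);
            writeString(out, key);
            writeString(out, value);
            out.flush();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Reads records sequentially until the data is exhausted.
    static List<String> readAll(byte[] data) {
        try {
            DataInputStream in = new DataInputStream(new ByteArrayInputStream(data));
            List<String> records = new ArrayList<>();
            while (in.available() > 0) {
                String key = readString(in);
                String value = readString(in);
                records.add(key + "=" + value);
            }
            return records;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    private static void writeString(DataOutputStream out, String s) throws IOException {
        byte[] bytes = s.getBytes(StandardCharsets.UTF_8);
        out.writeInt(bytes.length);
        out.write(bytes);
    }

    private static String readString(DataInputStream in) throws IOException {
        byte[] bytes = new byte[in.readInt()];
        in.readFully(bytes);
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        append(sink, "container_1", "first log");   // first writer session
        append(sink, "container_2", "second log");  // later append, no rewrite
        System.out.println(readAll(sink.toByteArray()));
    }
}
```

The design point is only that each record carries its own length, so nothing written earlier has to be reread or rewritten when a new record is added.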

 Aggregated Log Files should be combined
 ---

 Key: YARN-2942
 URL: https://issues.apache.org/jira/browse/YARN-2942
 Project: Hadoop YARN
  Issue Type: New Feature
Affects Versions: 2.6.0
Reporter: Robert Kanter
Assignee: Robert Kanter
 Attachments: CombinedAggregatedLogsProposal_v3.pdf, 
 CompactedAggregatedLogsProposal_v1.pdf, 
 CompactedAggregatedLogsProposal_v2.pdf, YARN-2942-preliminary.001.patch, 
 YARN-2942-preliminary.002.patch, YARN-2942.001.patch, YARN-2942.002.patch, 
 YARN-2942.003.patch


 Turning on log aggregation allows users to easily store container logs in 
 HDFS and subsequently view them in the YARN web UIs from a central place.  
 Currently, there is a separate log file for each Node Manager.  This can be a 
 problem for HDFS if you have a cluster with many nodes as you’ll slowly start 
 accumulating many (possibly small) files per YARN application.  The current 
 “solution” for this problem is to configure YARN (actually the JHS) to 
 automatically delete these files after some amount of time.  
 We should improve this by compacting the per-node aggregated log files into 
 one log file per application.





[jira] [Created] (YARN-3218) Implement CombinedAggregatedLogFormat Reader and Writer

2015-02-18 Thread Robert Kanter (JIRA)
Robert Kanter created YARN-3218:
---

 Summary: Implement CombinedAggregatedLogFormat Reader and Writer
 Key: YARN-3218
 URL: https://issues.apache.org/jira/browse/YARN-3218
 Project: Hadoop YARN
  Issue Type: Sub-task
Affects Versions: 2.8.0
Reporter: Robert Kanter
Assignee: Robert Kanter


We need to create a Reader and Writer for the CombinedAggregatedLogFormat.





[jira] [Commented] (YARN-2942) Aggregated Log Files should be combined

2015-02-18 Thread Robert Kanter (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2942?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14326821#comment-14326821
 ] 

Robert Kanter commented on YARN-2942:
-

I've created 4 subtasks (one is in HADOOP):
# HADOOP-11612: Workaround for Curator's ChildReaper requiring Guava 15+
# YARN-3218: Implement CombinedAggregatedLogFormat Reader and Writer
# YARN-3219: Use CombinedAggregatedLogFormat Writer to combine aggregated log 
files
# YARN-3220: JHS should display Combined Aggregated Logs when available



[jira] [Updated] (YARN-3122) Metrics for container's actual CPU usage

2015-02-18 Thread Anubhav Dhoot (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3122?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anubhav Dhoot updated YARN-3122:

Attachment: YARN-3122.002.patch

Fixed the test failure. The test was failing because it performed integer 
truncation during the calculation. The calculations are now done in floating 
point.
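
As an illustration of this kind of bug, here is a minimal sketch. It is not the code from the patch; the class, the method names, and the numbers are assumptions chosen to show how integer division can truncate a usage ratio to zero before it is scaled.

```java
// Hypothetical illustration only: integer vs. floating-point percentage math.
class CpuUsageMath {

    // Integer division happens first: 750 / 1000 is 0, so the result is 0.
    static int percentWithIntMath(long cpuMillis, long elapsedMillis) {
        return (int) (cpuMillis / elapsedMillis * 100);
    }

    // Casting one operand to float keeps the fractional part of the ratio.
    static float percentWithFloatMath(long cpuMillis, long elapsedMillis) {
        return (float) cpuMillis / elapsedMillis * 100f;
    }

    public static void main(String[] args) {
        System.out.println(percentWithIntMath(750, 1000));   // prints 0
        System.out.println(percentWithFloatMath(750, 1000)); // prints 75.0
    }
}
```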

 Metrics for container's actual CPU usage
 

 Key: YARN-3122
 URL: https://issues.apache.org/jira/browse/YARN-3122
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: nodemanager
Affects Versions: 2.6.0
Reporter: Anubhav Dhoot
Assignee: Anubhav Dhoot
 Attachments: YARN-3122.001.patch, YARN-3122.002.patch, 
 YARN-3122.prelim.patch, YARN-3122.prelim.patch


 It would be nice to capture resource usage per container, for a variety of 
 reasons. This JIRA is to track CPU usage. 
 YARN-2965 tracks the resource usage on the node, and the two implementations 
 should reuse code as much as possible. 





[jira] [Commented] (YARN-3087) [Aggregator implementation] the REST server (web server) for per-node aggregator does not work if it runs inside node manager

2015-02-18 Thread Li Lu (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3087?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14326830#comment-14326830
 ] 

Li Lu commented on YARN-3087:
-

I found this issue (#635 of Guice, 
https://code.google.com/p/google-guice/issues/detail?id=635) that may be 
related to our problem here. However, it seems the (only locally tested) 
patch was never committed, so I'm not sure whether it helps.

 [Aggregator implementation] the REST server (web server) for per-node 
 aggregator does not work if it runs inside node manager
 -

 Key: YARN-3087
 URL: https://issues.apache.org/jira/browse/YARN-3087
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: timelineserver
Reporter: Sangjin Lee
Assignee: Devaraj K

 This is related to YARN-3030. YARN-3030 sets up a per-node timeline 
 aggregator and the associated REST server. It runs fine as a standalone 
 process, but does not work if it runs inside the node manager due to possible 
 collisions of servlet mapping.
 Exception:
 {noformat}
 org.apache.hadoop.yarn.webapp.WebAppException: /v2/timeline: controller for v2 not found
   at org.apache.hadoop.yarn.webapp.Router.resolveDefault(Router.java:232)
   at org.apache.hadoop.yarn.webapp.Router.resolve(Router.java:140)
   at org.apache.hadoop.yarn.webapp.Dispatcher.service(Dispatcher.java:134)
   at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)
   at com.google.inject.servlet.ServletDefinition.doService(ServletDefinition.java:263)
   at com.google.inject.servlet.ServletDefinition.service(ServletDefinition.java:178)
   at com.google.inject.servlet.ManagedServletPipeline.service(ManagedServletPipeline.java:91)
   at com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:62)
   at com.sun.jersey.spi.container.servlet.ServletContainer.doFilter(ServletContainer.java:900)
   at com.sun.jersey.spi.container.servlet.ServletContainer.doFilter(ServletContainer.java:834)
   at com.sun.jersey.spi.container.servlet.ServletContainer.doFilter(ServletContainer.java:795)
 ...
 {noformat}





[jira] [Commented] (YARN-2942) Aggregated Log Files should be combined

2015-02-18 Thread Vinod Kumar Vavilapalli (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2942?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14326912#comment-14326912
 ] 

Vinod Kumar Vavilapalli commented on YARN-2942:
---

Apologies for coming in really late. I've been thinking about this problem for 
a long time, since before YARN came to Apache :)

I think HDFS-3689 will help a lot in this area. Offline, I was asking the HDFS 
folks to help make progress there. Now that it has gone in, I think we should 
consider using it as the first step. It should help reduce the file count 
completely, even though the block-count problem is still unresolved. The 
long-term solution for the latter really is HDFS supporting atomic append (with 
concurrent writers); it's better to get the problem fixed at the storage layer.

We should try to avoid rereading the entire log file and rewriting it. How 
about we try the concat approach (with variable-length blocks) first, before we 
try the reread+rewrite approach?


