[ 
https://issues.apache.org/jira/browse/YARN-10304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17136542#comment-17136542
 ] 

Adam Antal commented on YARN-10304:
-----------------------------------

Let me start with the bad news. I am very sorry that I come up with this thing 
in v5, but /remote-app-log-dir/user/suffix/ is not enough to find apps using 
the new bucketed path. For users this endpoint is valuable when they know 
exactly where the aggregated logs for the application are. Therefore we need 
another query parameter that can specify the application id, and then the 
{{LogServlet}} can construct the whole path (creating bucket id and 
concatenating the app id as well). If no app id is provided then the behaviour 
should be the same as now. This will probably also need another UT :( 

Regarding the existing patch:
- Regarding the unit tests:
  - This seems to be wrong:
{code:java}
      String path = String.format("%s/%s/bucket-%s-%s",
          YarnConfiguration.DEFAULT_NM_REMOTE_APP_LOG_DIR, remoteUser,
          testSuffix, entry.getFileController().toLowerCase());
{code}
  What is that "bucket-" prefix? That should not be there. Also I don't 
understand why the test is passing. The controller's suffix is initialized in 
{{LogAggregationFileController#extractRemoteRootLogDirSuffix}} and you can 
check that there is no "bucket-" involved in this. Could you investigate this?
  - Since the TFile is always added to the bottom for backward compatibility 
purposes, I recommend defining other controllers. Also we probably need to make 
exact match for the controllers, the {{.contains}} is not enough.
  - need exact match for controllers
  - Could you also write another test case without the user queryparam, but 
setting the login user? You can use {{UserGroupInformation#setLoginUser}}. This 
would make sure that we find the right user when the request processed.
- The new Configuration object in {{WebServletModule}} is a bit of an overkill. 
Maybe the easiest solution is to use {{YarnConfiguration}}'s 
{{YarnConfiguration(Configuration)}} constructor to clone the {{conf}} object 
in {{testRemoteLogDir}} - thus you don't need to bother restoring its state. 
That would be just enough for us.

> Create an endpoint for remote application log directory path query
> ------------------------------------------------------------------
>
>                 Key: YARN-10304
>                 URL: https://issues.apache.org/jira/browse/YARN-10304
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>            Reporter: Andras Gyori
>            Assignee: Andras Gyori
>            Priority: Minor
>         Attachments: YARN-10304.001.patch, YARN-10304.002.patch, 
> YARN-10304.003.patch, YARN-10304.004.patch, YARN-10304.005.patch
>
>
> The logic of the aggregated log directory path determination (currently based 
> on configuration) is scattered around the codebase and duplicated multiple 
> times. By providing a separate class for creating the path for a specific 
> user, it allows for an abstraction over this logic. This could be used in 
> place of the previously duplicated logic, moreover, we could provide an 
> endpoint to query this path.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org

Reply via email to