[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4346?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13418165#comment-13418165
 ] 

Ahmed Radwan commented on MAPREDUCE-4346:
-----------------------------------------

Thanks Arun,

Sorry, it took me some time to get back to this. Based on your reference above 
to MR2, I discussed it further with tucu offline. I wanted to reassess the 
situation and better understand the difficulty you are referring to.

In MR2, we already have getAllJobs(), which returns all jobs in any status. So 
to support the new refined version (similar to the MR1 version proposed here), 
we have two options for filtering this list:

* 1) Client-side filtering: The new implementation will just call getAllJobs() 
and filter the list in the JobClient. This option only provides the required 
compatibility; it does not remove the overhead we discussed earlier, so I 
wouldn't prefer it.

* 2) Resource Manager filtering: Currently, getAllJobs() in MR2 uses the 
TypeConverter to convert the whole list of returned jobs from 
List<ApplicationReport> to JobStatus[] to be compatible with MR1. So to be able 
to filter this list and avoid doing this type conversion on the server side, we 
can have the JobClient do this conversion before sending the request to the 
resource manager.
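
As a rough illustration of option 1, the sketch below filters an already 
fetched list by state on the client. The class, enum, and method names are 
hypothetical stand-ins, not the Hadoop JobStatus or JobClient API. It also 
shows why this option keeps the overhead: the full list is built and 
transferred before anything is dropped.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.EnumSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-ins for illustration only; NOT the Hadoop classes.
final class ClientSideJobFilterSketch {

    // Illustrative run states; the real set of states is an assumption here.
    enum State { PREP, RUNNING, SUCCEEDED, FAILED, KILLED }

    static final class Status {
        final String jobId;
        final State state;

        Status(String jobId, State state) {
            this.jobId = jobId;
            this.state = state;
        }
    }

    // Option 1: the complete list has already crossed the wire;
    // the client simply discards entries whose state was not requested.
    static List<Status> filterByState(List<Status> allJobs, Set<State> wanted) {
        List<Status> kept = new ArrayList<>();
        for (Status s : allJobs) {
            if (wanted.contains(s.state)) {
                kept.add(s);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<Status> all = Arrays.asList(
            new Status("job_1", State.RUNNING),
            new Status("job_2", State.SUCCEEDED),
            new Status("job_3", State.PREP));
        for (Status s : filterByState(all, EnumSet.of(State.RUNNING, State.PREP))) {
            System.out.println(s.jobId + " " + s.state);
        }
    }
}
```

Under option 2, the same predicate would be evaluated by the resource manager, 
so only the matching entries would be returned to the client.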

Independent of this refined version of getAllJobs(), there is also more work 
that needs to be done in MR2 in this context, such as:

* Dealing with retiredJobs and separately getting them from the history server 
(if requested).
* Dealing with different application types, since it doesn't make sense for an 
MR client to get statuses for distributed shell jobs or other types of 
applications that are submitted by other types of clients.
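
The application-type point could look roughly like the sketch below, which 
keeps only the reports of one type. The Report class, its appType field, and 
the type labels are assumptions made for illustration; they are not taken from 
the YARN ApplicationReport API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in types for illustration only; NOT the YARN API.
final class AppTypeFilterSketch {

    static final class Report {
        final String appId;
        final String appType; // illustrative labels such as "MAPREDUCE"

        Report(String appId, String appType) {
            this.appId = appId;
            this.appType = appType;
        }
    }

    // Keep only the applications of the type this client understands.
    static List<Report> onlyOfType(List<Report> reports, String wantedType) {
        List<Report> kept = new ArrayList<>();
        for (Report r : reports) {
            if (wantedType.equals(r.appType)) {
                kept.add(r);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<Report> reports = Arrays.asList(
            new Report("app_1", "MAPREDUCE"),
            new Report("app_2", "DISTRIBUTED_SHELL"),
            new Report("app_3", "MAPREDUCE"));
        for (Report r : onlyOfType(reports, "MAPREDUCE")) {
            System.out.println(r.appId);
        }
    }
}
```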

I'll file a separate jira for these MR2 changes, and will work on a patch for 
it. Please let me know if you have any comments or considerations for this 
route.
                
> Adding a refined version of JobTracker.getAllJobs() and exposing through the 
> JobClient
> --------------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-4346
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4346
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: mrv1
>            Reporter: Ahmed Radwan
>            Assignee: Ahmed Radwan
>         Attachments: MAPREDUCE-4346.patch, MAPREDUCE-4346_rev2.patch, 
> MAPREDUCE-4346_rev3.patch, MAPREDUCE-4346_rev4.patch
>
>
> The current implementation for JobTracker.getAllJobs() returns all submitted 
> jobs in any state, in addition to retired jobs. This list can be long and 
> represents an unneeded overhead especially in the case of clients only 
> interested in jobs in specific state(s). 
> It is beneficial to include a refined version where only jobs having specific 
> statuses are returned and retired jobs are optional to include. 
> I'll be uploading an initial patch momentarily.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
