[ 
https://issues.apache.org/jira/browse/MAPREDUCE-861?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12748792#action_12748792
 ] 

Hemanth Yamijala commented on MAPREDUCE-861:
--------------------------------------------

I've started taking a look at this code, and have come till reviewing the 
framework changes. Here are some comments:

- I would suggest that we move out the parsing code for the new queues format 
into a separate class like QueueHierarchyBuilder, in order to keep the 
QueueManager class simple.
- I would also suggest that we make the deprecated and new class stick to a 
unified interface that provides APIs like:
{code}
Queue getRoot();
boolean areAclsEnabled();
{code}
These can be used by the QueueManager on the appropriate builder depending on 
whether old configuration is available or not. This way, the initialization 
code (like in the constructor) of QueueManager will be consistent, irrespective 
of where the queue configuration is.
- Also, since we want to discard the mapred-site.xml loaded properties ASAP, we 
probably should parse and build the hierarchy from the constructor of the 
builder itself.
- createHierarchy should be refactored to parse leaf level queues and non leaf 
level queues separately. I would suggest something along the lines of:
{code}
for each queue node:
  newQueue.name = parseName()
  newQueue.properties = parseProperties()
  if (node is a leaf level queue) {
    parseLeafQueueProperties() // here we parse ACLs, state and other 
properties.
  } else {
    for every queue node:
      recursively parse queue nodes.
  }
{code}
- Calling getInnerQueues without calling addChild can cause a null pointer 
because children is null.
- hashCode is overridden, but not equals
- I assume the order of child elements of an XML tag will be in the order in 
which they are defined in the file. I think we must explicitly ensure that 
elements come in a fixed order. For e.g. currently, we are not enforcing an 
order on how tags under the 'queue' node are defined. This could lead to cases 
where if a 'queue' element comes before the 'name' element, we would call parse 
hierarchy for the child queue without the parent's name.
- Shouldn't we check in populateProperties that the node is of type element.
- Since getNamedItem can be null, we should check for it in populateProperties. 
This will allow us to throw a better exception.
- In QueueManager.initialize, allQueues.putAll(leafQueues) seems unnecessary, 
since getInnerQueues already does this.
- setSchedulerInfo and getSchedulerInfo should work on all queues, not just 
leaf queues.
- The current refresh code is broken with reference to what we want to do long 
term. We should only change properties of leaf queues. Since we are planning to 
change the refresh model in MAPREDUCE-893, I think for completeness sake the 
refresh of the code should build the entire hierarchy, but pickup only leaf 
level queues and copy properties like acls and state. Currently, since the 
refresh code is adding to the root queues, hierarchy could be affected which is 
not in the scope even of MAPREDUCE-893.
- getJobQueueInfos is calling getJobQueueInfo twice. We should avoid this. Not 
a bug in the patch, as it was already there, but it will help to fix it.
- dumpConfiguration should be changed. It should be changed to the new JSON 
format.
- DeprecatedQueueHierarchyBuilder should not hold on to an instance of the 
configuration

Some minor comments on style:
- Import only required classes and not all classes in a package.
- getInnerQueues can be renamed as getDescendentQueues
- DeprecatedQueueHierarchyBuilder.loadingDeprecatedResource can be loadQueues() 
or createQueues().
- I think we are using "acls" as an XML attribute of the queues element to 
determine if ACLs are enabled or not. I would suggest we name it as aclsEnabled.
- QueueManager.validate should be better documented to specify what it's 
validating.
- There's an empty statement in QueueManager.populateProperties ( just a ';' )

I am unable to decide on these points:
- Do we need leafQueues and allQueues stored separately ?
- Should we validate using an XSD ? I know you were planning on this. After 
looking at the code, I am thinking this makes good sense.
- I am unable to decide if refreshConfiguration should be in the base class for 
QueueHierarchyBuilder, because it would need a Configuration in the deprecated 
case, and none in the new case.


> Modify queue configuration format and parsing to support a hierarchy of 
> queues.
> -------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-861
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-861
>             Project: Hadoop Map/Reduce
>          Issue Type: Sub-task
>            Reporter: Hemanth Yamijala
>            Assignee: rahul k singh
>         Attachments: MAPREDUCE-861-1.patch
>
>
> MAPREDUCE-853 proposes to introduce a hierarchy of queues into the Map/Reduce 
> framework. This JIRA is for defining changes to the configuration related to 
> queues. 
> The current format for defining a queue and its properties is as follows: 
> mapred.queue.<queue-name>.<property-name>. For e.g. 
> mapred.queue.<queue-name>.acl-submit-job. The reason for using this verbose 
> format was to be able to reuse the Configuration parser in Hadoop. However, 
> administrators currently using the queue configuration have already indicated 
> a very strong desire for a more manageable format. Since, this becomes more 
> unwieldy with hierarchical queues, the time may be good to introduce a new 
> format for representing queue configuration.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

Reply via email to