[ https://issues.apache.org/jira/browse/MAPREDUCE-861?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12748791#action_12748791 ]
Hemanth Yamijala commented on MAPREDUCE-861: -------------------------------------------- I've started taking a look at this code, and have come till reviewing the framework changes. Here are some comments: - I would suggest that we move out the parsing code for the new queues format into a separate class like QueueHierarchyBuilder, in order to keep the QueueManager class simple. - I would also suggest that we make the deprecated and new class stick to a unified interface that provides APIs like: {code} Queue getRoot(); boolean areAclsEnabled(); {code} These can be used by the QueueManager on the appropriate builder depending on whether old configuration is available or not. This way, the initialization code (like in the constructor) of QueueManager will be consistent, irrespective of where the queue configuration is. - Also, since we want to discard the mapred-site.xml loaded properties ASAP, we probably should parse and build the hierarchy from the constructor of the builder itself. - createHierarchy should be refactored to parse leaf level queues and non leaf level queues separately. I would suggest something along the lines of: {code} for each queue node: newQueue.name = parseName() newQueue.properties = parseProperties() if (node is a leaf level queue) { parseLeafQueueProperties() // here we parse ACLs, state and other properties. } else { for every queue node: recursively parse queue nodes. } {code} - Calling getInnerQueues without calling addChild can cause a null pointer because children is null. - Also, the method getInnerQueues doesn't seem to be used. So, is it required ? - hashCode is overridden, but not equals - I assume the order of child elements of an XML tag will be in the order in which they are defined in the file. I think we must explicitly ensure that elements come in a fixed order. For e.g. currently, we are not enforcing an order on how tags under the 'queue' node are defined. This could lead to cases where if a 'queue' element comes before the 'name' element, we would call parse hierarchy for the child queue without the parent's name. - Shouldn't we check in populateProperties that the node is of type element. - Since getNamedItem can be null, we should check for it in populateProperties. This will allow us to throw a better exception. - In QueueManager.initialize, allQueues.putAll(leafQueues) seems unnecessary, since getInnerQueues already does this. - setSchedulerInfo and getSchedulerInfo should work on all queues, not just leaf queues. - The current refresh code is broken with reference to what we want to do long term. We should only change properties of leaf queues. Since we are planning to change the refresh model in MAPREDUCE-893, I think for completeness sake the refresh of the code should build the entire hierarchy, but pickup only leaf level queues and copy properties like acls and state. Currently, since the refresh code is adding to the root queues, hierarchy could be affected which is not in the scope even of MAPREDUCE-893. - getJobQueueInfos is calling getJobQueueInfo twice. We should avoid this. Not a bug in the patch, as it was already there, but it will help to fix it. - dumpConfiguration should be changed. It should be changed to the new JSON format. - DeprecatedQueueHierarchyBuilder should not hold on to an instance of the configuration Some minor comments on style: > Modify queue configuration format and parsing to support a hierarchy of > queues. > ------------------------------------------------------------------------------- > > Key: MAPREDUCE-861 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-861 > Project: Hadoop Map/Reduce > Issue Type: Sub-task > Reporter: Hemanth Yamijala > Assignee: rahul k singh > Attachments: MAPREDUCE-861-1.patch > > > MAPREDUCE-853 proposes to introduce a hierarchy of queues into the Map/Reduce > framework. This JIRA is for defining changes to the configuration related to > queues. > The current format for defining a queue and its properties is as follows: > mapred.queue.<queue-name>.<property-name>. For e.g. > mapred.queue.<queue-name>.acl-submit-job. The reason for using this verbose > format was to be able to reuse the Configuration parser in Hadoop. However, > administrators currently using the queue configuration have already indicated > a very strong desire for a more manageable format. Since, this becomes more > unwieldy with hierarchical queues, the time may be good to introduce a new > format for representing queue configuration. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.