[ 
https://issues.apache.org/jira/browse/HIVE-69?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12652248#action_12652248
 ] 

Ashish Thusoo commented on HIVE-69:
-----------------------------------

Namit,

Can you upload the patch here. The following are my comments on the latest 
patch 

+1

some minor issues are indicated below. Lets create separate JIRAs to address 
those...

mostly minor comments.
Inline Comments
ql/src/java/org/apache/hadoop/hive/ql/exec/FileSinkOperator.java:169    should 
add @Override here.
ql/src/java/org/apache/hadoop/hive/ql/exec/TableScanOperator.java:51    nitpick 
- start with a capital letter.
ql/src/java/org/apache/hadoop/hive/ql/exec/Task.java:112        nitpick - 
please fix the comment here.
ql/src/java/org/apache/hadoop/hive/ql/parse/DefaultDispatcher.java:26   
imported twice?
ql/src/java/org/apache/hadoop/hive/ql/parse/DefaultOpGraphWalker.java:69        
should stack be parametrized here?
ql/src/java/org/apache/hadoop/hive/ql/parse/RuleRegExp.java:34  javadocs for 
all of these...
hadoop/hive/ql/parse/SemanticAnalyzer.java:2999 These can be packaged into 
ParseContext.
hadoop/hive/ql/parse/OperatorProcessor.java:52  Since all the process calls are 
being made using reflection, we can get rid of all the process functions.
hadoop/hive/ql/parse/OperatorProcessor.java:40  This can be made an interface 
instead of a class.
hadoop/hive/ql/parse/SemanticAnalyzer.java:3035 Ideally this could also be done 
through a second pass walker, or it could be done in the same walker as above?
hadoop/hive/ql/parse/RuleRegExp.java:39 Some description would help.
hadoop/hive/ql/parse/DefaultDispatcher.java:26  double inclusion!!
hadoop/hive/ql/parse/DefaultDispatcher.java:63  probably this can just be 
OperatorProcessorContext.class instead of taking the class name and then 
getting the class out of it.
hadoop/hive/ql/parse/DefaultRuleDispatcher.java:40      javadocs for each of 
these variables.
hadoop/hive/ql/optimizer/GenMRFileSink1.java:37 javadocs
hadoop/hive/ql/optimizer/GenMRFileSink1.java:68 Why is there a mapping from 
null to currTask.
hadoop/hive/ql/optimizer/GenMRFileSink1.java:62 Can you explain what is being 
done here. In the original SemanticAnalyzer, we were just setting up the 
dependency from the MR task to move task when we saw the FileSinkOperator. What 
is the rest of the logic for?

> genMapRedTasks does not use the tree walker and uses implicit state which 
> makes it difficult to enhance
> -------------------------------------------------------------------------------------------------------
>
>                 Key: HIVE-69
>                 URL: https://issues.apache.org/jira/browse/HIVE-69
>             Project: Hadoop Hive
>          Issue Type: Bug
>          Components: Query Processor
>            Reporter: Namit Jain
>            Assignee: Namit Jain
>
> In SemanticAnalyzer, genmapredtasks() does not use a tree walker. For 
> map-side joins, the taskplan needs to be enhanced to be possibly
> broken at MapSink also. Basically, the code is very difficult to enhance 
> since there are implicit assumptions that reduce sink is the only
> operator where the plan breaks.
> This should be enhanced so that the user can implement their own task 
> generation logic which is independent of the tree walking.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

Reply via email to