Raghavi Ravi created OOZIE-3057:
-----------------------------------

             Summary: Custom Partitioner not working in Oozie Mapreduce action
                 Key: OOZIE-3057
                 URL: https://issues.apache.org/jira/browse/OOZIE-3057
             Project: Oozie
          Issue Type: Bug
          Components: action, workflow
    Affects Versions: 4.1.0
         Environment: Red Hat Enterprise Linux Server release 7.2 (Maipo)
Linux version 3.10.0-327.10.1.el7.x86_64 
(mockbu...@x86-021.build.eng.bos.redhat.com) (gcc version 4.8.5 20150623 (Red 
Hat 4.8.5-4) (GCC) ) #1 SMP Sat Jan 23 04:54:55 EST 2016
cdh version - 5.10.1
oozie version - 4.1.0
Hue™ 3.11 - The Hadoop UI
            Reporter: Raghavi Ravi


I implemented a secondary sort in MapReduce using the old API 
(org.apache.hadoop.mapred.*) and am trying to run it through Oozie (from Hue).

Although I have set the partitioner class in the workflow properties, the 
partitioner is never invoked, so the output is not what I expect.

The same code runs fine when launched with the hadoop command from the CLI.

Here is my workflow.xml:

<workflow-app name="MyTriplets" xmlns="uri:oozie:workflow:0.5">
<start to="mapreduce-598d"/>
<kill name="Kill">
    <message>Action failed, error message[${wf:errorMessage(wf:lastErrorNode())}]</message>
</kill>
<action name="mapreduce-598d">
    <map-reduce>
        <job-tracker>${jobTracker}</job-tracker>
        <name-node>${nameNode}</name-node>
        <configuration>
            <property>
                <name>mapred.output.dir</name>
                <value>/test_1109_3</value>
            </property>
            <property>
                <name>mapred.input.dir</name>
                <value>/apps/hive/warehouse/7360_0609_rx/day=06-09-2017/hour=13/quarter=2/,/apps/hive/warehouse/7360_0609_tx/day=06-09-2017/hour=13/quarter=2/,/apps/hive/warehouse/7360_0509_util/day=05-09-2017/hour=16/quarter=1/</value>
            </property>
            <property>
                <name>mapred.input.format.class</name>
                <value>org.apache.hadoop.hive.ql.io.RCFileInputFormat</value>
            </property>
            <property>
                <name>mapred.mapper.class</name>
                <value>PonRankMapper</value>
            </property>
            <property>
                <name>mapred.reducer.class</name>
                <value>PonRankReducer</value>
            </property>
            <property>
                <name>mapred.output.value.comparator.class</name>
                <value>PonRankGroupingComparator</value>
            </property>
            <property>
                <name>mapred.mapoutput.key.class</name>
                <value>PonRankPair</value>
            </property>
            <property>
                <name>mapred.mapoutput.value.class</name>
                <value>org.apache.hadoop.io.Text</value>
            </property>
            <property>
                <name>mapred.reduce.output.key.class</name>
                <value>org.apache.hadoop.io.NullWritable</value>
            </property>
            <property>
                <name>mapred.reduce.output.value.class</name>
                <value>org.apache.hadoop.io.Text</value>
            </property>
            <property>
                <name>mapred.reduce.tasks</name>
                <value>1</value>
            </property>
            <property>
                <name>mapred.partitioner.class</name>
                <value>PonRankPartitioner</value>
            </property>
            <property>
                <name>mapred.mapper.new-api</name>
                <value>False</value>
            </property>
        </configuration>
    </map-reduce>
    <ok to="End"/>
    <error to="Kill"/>
</action>
<end name="End"/>
</workflow-app>

When running with the hadoop jar command, I set the partitioner class through 
the JobConf.setPartitionerClass API.

With the old API, the partitioner is not executed in Oozie, in spite of adding 
this property:

            <property>
                <name>mapred.partitioner.class</name>
                <value>PonRankPartitioner</value>
            </property>

I then ran the same logic using the new API (org.apache.hadoop.mapreduce) and 
added the mapreduce.partitioner.class property to the workflow.

In that case the partitioner was executed and the desired output was produced.
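
For reference, the new-API variant described above might look roughly like the 
fragment below. This is only a sketch, not the exact workflow that was run: the 
mapreduce.partitioner.class property is the one named in this report, while 
setting both new-api flags to true and reusing the PonRankPartitioner class name 
for a new-API implementation (one extending 
org.apache.hadoop.mapreduce.Partitioner) are assumptions made for illustration. 
The mapper, reducer, and input/output properties are omitted.

```xml
<configuration>
    <!-- Assumption: both flags are switched on so the action uses the new API -->
    <property>
        <name>mapred.mapper.new-api</name>
        <value>true</value>
    </property>
    <property>
        <name>mapred.reducer.new-api</name>
        <value>true</value>
    </property>
    <!-- Property name as given in this report; the class name is assumed to be
         a new-API partitioner reusing the old class name -->
    <property>
        <name>mapreduce.partitioner.class</name>
        <value>PonRankPartitioner</value>
    </property>
</configuration>
```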





--
This message was sent by Atlassian JIRA
(v6.4.14#64029)