er?

It seems to be using value.get(). That having been said, you should really
partition based on key, not on value. (I am not sure why, exactly, the value
is provided to the getPartition() method.)


Moreover, I think the problem is that you are using division ( / ), not
modulus ( % ). Ignoring integer truncation, your code simplifies to:
(value.get() / T) / (T / numPartitions) = value.get() * numPartitions / T^2.
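
To see what that expression does with real numbers, here is a small
standalone sketch of the same arithmetic. It is not Hadoop code: plain ints
stand in for IntWritable, the class and method names are made up for
illustration, and T = 100 with 2 partitions is only an assumed test setup.

```java
// Standalone reproduction of the arithmetic in the quoted getPartition().
// No Hadoop types are used; value, t and numPartitions are plain ints.
class OriginalArithmetic {

    // Mirrors the quoted code: (value / t) / (t / numPartitions),
    // where every division is integer division.
    static int originalPartition(int value, int t, int numPartitions) {
        int newT = t / numPartitions; // is 0 when t < numPartitions, so the last line would throw
        int id = value / t;
        return id / newT;
    }

    public static void main(String[] args) {
        int t = 100;           // assumed value of "tval"
        int numPartitions = 2; // assumed number of reducers
        int[] samples = {0, 99, 4999, 5000, 20000};
        for (int value : samples) {
            System.out.println(value + " -> "
                    + originalPartition(value, t, numPartitions));
        }
    }
}
```

With these assumed settings, every value below 5000 lands in partition 0, and
a value of 20000 yields 4, which is outside [0, 2).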

The contract of getPartition() is that it returns a value in [0,
numPartitions). The division operators are not guaranteed to return anything
in this range, but (foo % numPartitions) will always do the right thing as
long as foo is non-negative. So it's probably just assigning everything to
reduce partition 0. (Alternatively, it could be that value * numPartitions <
T^2 for any values of T you're testing with, which means that integer
division will return 0.)
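
A minimal sketch of partition functions that honor the contract, assuming
you partition on the key. Again this is plain Java rather than a Hadoop
Partitioner: the class and method names are hypothetical, ints stand in for
IntWritable, and Math.floorMod is used instead of a bare % so that negative
keys also map into range.

```java
// Two ways to map an int key into [0, numPartitions).
// Plain Java for illustration; not tied to the Hadoop Partitioner interface.
class PartitionSketch {

    // Hash-style: consecutive keys rotate through the partitions.
    // floorMod never returns a negative result for a positive modulus.
    static int byModulus(int key, int numPartitions) {
        return Math.floorMod(key, numPartitions);
    }

    // Range-style: keys are grouped into buckets of width t, and the
    // bucket index is then folded into [0, numPartitions).
    static int byRange(int key, int t, int numPartitions) {
        int bucket = Math.floorDiv(key, t);
        return Math.floorMod(bucket, numPartitions);
    }

    public static void main(String[] args) {
        int t = 100;           // assumed bucket width ("tval")
        int numPartitions = 2; // assumed number of reducers
        int[] keys = {-3, 7, 150, 250};
        for (int key : keys) {
            System.out.println(key + " -> modulus " + byModulus(key, numPartitions)
                    + ", range " + byRange(key, t, numPartitions));
        }
    }
}
```

In a real partitioner the same expression would go in the return statement
of getPartition(), using key.get() and the numPartitions argument.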

- Aaron


On Fri, Jan 30, 2009 at 3:43 PM, Sandy <snickerdoodl...@gmail.com> wrote:

> Hi James,
>
> Thank you very much! :-)
>
> -SM
>
> On Fri, Jan 30, 2009 at 4:17 PM, james warren <ja...@rockyou.com> wrote:
>
> > Hello Sandy -
> > Your partitioner isn't using any information from the key/value pair -
> it's
> > only using the value T which is read once from the job configuration.
> >  getPartition() will always return the same value, so all of your data is
> > being sent to one reducer. :P
> >
> > cheers,
> > -James
> >
> > On Fri, Jan 30, 2009 at 1:32 PM, Sandy <snickerdoodl...@gmail.com>
> wrote:
> >
> > > Hello,
> > >
> > > Could someone point me toward some more documentation on how to write
> > one's
> > > own partition class? I am having quite a bit of trouble getting mine
> to
> > > work. So far, it looks something like this:
> > >
> > > public class myPartitioner extends MapReduceBase implements
> > > Partitioner<IntWritable, IntWritable> {
> > >
> > >    private int T;
> > >
> > >    public void configure(JobConf job) {
> > >    super.configure(job);
> > >    String myT = job.get("tval");        //this is user defined
> > >    T = Integer.parseInt(myT);
> > >    }
> > >
> > >    public int getPartition(IntWritable key, IntWritable value, int
> > > numReduceTasks) {
> > >        int newT = (T/numReduceTasks);
> > >        int id = (value.get() / T);
> > >        return (int)(id/newT);
> > >    }
> > > }
> > >
> > > In the run() function of my M/R program I just set it using:
> > >
> > > conf.setPartitionerClass(myPartitioner.class);
> > >
> > > Is there anything else I need to set in the run() function?
> > >
> > >
> > > The code compiles fine. When I run it, I know it is "using" the
> > > partitioner,
> > > since I get different output than if I just let it use HashPartitioner.
> > > However, it is not splitting between the reducers at all! If I set the
> > > number of reducers to 2, all the output shows up in part-00000, while
> > > part-00001 has nothing.
> > >
> > > I am having trouble debugging this since I don't know how I can observe
> > the
> > > values of numReduceTasks (which I assume is being set by the system).
> Is
> > > this a proper assumption?
> > >
> > > If I try to insert any println() statements in the function, it isn't
> > > outputted to either my terminal or my log files. Could someone give me
> > some
> > > general advice on how best to debug pieces of code like this?
> > >
> >
>
