Are you certain that your records are being split into key and value the way you expect? That is the usual reason for odd join behavior. I haven't used the join code past 19.1, however.
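One way to sanity-check the split is to reproduce it outside the job. Below is a minimal plain-Java sketch, not the Hadoop join code: it splits each line at the first tab (assuming tab-separated text input; if a line has no tab, the whole line is treated as the key) and then takes the cross product of values per key, which is what the javadoc describes. The class and method names (JoinSketch, splitKeyValue, innerJoin) are made up for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper class; none of these names come from Hadoop.
final class JoinSketch {

    // Split a line at the first separator. If the separator is absent,
    // the whole line becomes the key and the value is empty.
    static String[] splitKeyValue(String line, char sep) {
        int i = line.indexOf(sep);
        if (i < 0) {
            return new String[] { line, "" };
        }
        return new String[] { line.substring(0, i), line.substring(i + 1) };
    }

    // Group the values of one table by key, preserving input order.
    static Map<String, List<String>> group(List<String> lines, char sep) {
        Map<String, List<String>> byKey = new LinkedHashMap<>();
        for (String line : lines) {
            String[] kv = splitKeyValue(line, sep);
            byKey.computeIfAbsent(kv[0], k -> new ArrayList<>()).add(kv[1]);
        }
        return byKey;
    }

    // Inner join: for every key present in both tables, emit the
    // cross product of the left values and the right values.
    static List<String> innerJoin(List<String> left, List<String> right, char sep) {
        Map<String, List<String>> l = group(left, sep);
        Map<String, List<String>> r = group(right, sep);
        List<String> out = new ArrayList<>();
        for (Map.Entry<String, List<String>> e : l.entrySet()) {
            List<String> rightValues = r.get(e.getKey());
            if (rightValues == null) {
                continue; // key missing on the right side: no output for an inner join
            }
            for (String a : e.getValue()) {
                for (String b : rightValues) {
                    out.add(e.getKey() + "->" + a + b);
                }
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Tab-separated records: keys match, so both rows come out.
        System.out.println(innerJoin(
                Arrays.asList("k1\ta"),
                Arrays.asList("k1\tc", "k1\td"), '\t')); // [k1->ac, k1->ad]

        // Space-separated records: no tab, so each whole line is a distinct key
        // and nothing joins.
        System.out.println(innerJoin(
                Arrays.asList("k1 a"),
                Arrays.asList("k1 c", "k1 d"), '\t')); // []
    }
}
```

If the keys printed from your real input are not what you expect (for example, the whole line shows up as the key), the join will silently match fewer rows than the cross product suggests.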
On Wed, Nov 18, 2009 at 12:42 PM, Edmund Kohlwey <ekohl...@gmail.com> wrote:
> I'm using Cloudera's distribution for Hadoop 0.20.1 + 133
>
> The javadocs for package org.apache.hadoop.mapred.join state "For a given
> key, each operation will consider the cross product of all values for all
> sources at that node"
>
> I'm doing an inner join between two tables with a text key. One table has
> multiple values for the same key. I would expect, from the documentation, to
> see the cross product of the values for a given key represented in the
> output. Instead I'm simply getting a single row. Does anyone know if this is
> a bug or if it's the intended functionality (and the documentation is
> flawed)?
>
> table 1
> k1 -> a
>
> table 2
> k1 -> c
> k1 -> d
>
> I should get:
> table 1 inner join table 2
> k1 -> ac
> k1 -> ad
>
> Instead I'm getting:
> table 1 inner join table 2
> k1 -> ac
>

--
Pro Hadoop, a book to guide you from beginner to hadoop mastery,
http://www.amazon.com/dp/1430219424?tag=jewlerymall
www.prohadoopbook.com a community for Hadoop Professionals