I'm using Cloudera's distribution for Hadoop, version 0.20.1+133.

The javadocs for the package org.apache.hadoop.mapred.join state: "For a given key, each operation will consider the cross product of all values for all sources at that node."

I'm doing an inner join between two tables with a text key. One of the tables (table 2 below) has multiple values for the same key. From the documentation, I would expect the output to contain the cross product of the values for that key. Instead I'm getting only a single row. Does anyone know whether this is a bug, or whether it's the intended behavior (and the documentation is wrong)?

table 1
k1 -> a

table 2
k1 -> c
k1 -> d

I should get:
table 1 inner join table 2
k1 -> ac
k1 -> ad

Instead I'm getting:
table 1 inner join table 2
k1 -> ac
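
To make the expectation concrete, here is a minimal sketch in plain Python of the semantics I read into the javadoc: for each key present in every table, emit the cross product of that key's values. This is only an illustration, not Hadoop code; the function name inner_join and the list-of-pairs table layout are made up for the example.

```python
from collections import defaultdict
from itertools import product


def inner_join(*tables):
    """Inner join of (key, value) lists, emitting the cross product per key.

    Hypothetical helper for illustration; not part of any Hadoop API.
    """
    # Group each table's values by key, preserving input order.
    grouped = []
    for table in tables:
        by_key = defaultdict(list)
        for key, value in table:
            by_key[key].append(value)
        grouped.append(by_key)

    if not grouped:
        return []

    # An inner join keeps only keys that appear in every table.
    common = set(grouped[0]).intersection(*grouped[1:])

    result = []
    for key in sorted(common):
        # Cross product of the value lists for this key, one list per table.
        for combo in product(*(g[key] for g in grouped)):
            result.append((key, combo))
    return result


table1 = [("k1", "a")]
table2 = [("k1", "c"), ("k1", "d")]

for key, values in inner_join(table1, table2):
    print(key, "->", "".join(values))
```

With the two tables above this prints both k1 -> ac and k1 -> ad, which is the output I expected from the join.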
