If you are looking for exponential output volumes, run a market-basket
test. You will need only a few thousand baskets with 100 items or so.
Set minimum support very low (5%) and confidence at 80%, and you
should get a flood of data. Any frequent itemset (FIS) algorithm should
do the job.
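
To give a feel for the blow-up, here is a rough sketch in Python. It is a
brute-force itemset counter, not any particular FIS implementation, and it
skips the rule-generation step where the confidence threshold would apply.
The names make_baskets and frequent_itemsets, the synthetic data, and the
max_size cap are all my own assumptions for illustration.

```python
import random
from collections import Counter
from itertools import combinations


def make_baskets(n_baskets, n_items, basket_size, seed=0):
    """Generate synthetic baskets (hypothetical data, uniform random items)."""
    rng = random.Random(seed)
    return [sorted(rng.sample(range(n_items), basket_size))
            for _ in range(n_baskets)]


def frequent_itemsets(baskets, min_support, max_size):
    """Count every itemset of up to max_size items and keep the frequent ones.

    Brute force: each basket emits all of its sub-itemsets, which is
    exactly where the output volume explodes. max_size is an assumed cap
    to keep the sketch tractable.
    """
    counts = Counter()
    for basket in baskets:
        items = sorted(set(basket))  # canonical order so keys match
        for k in range(1, max_size + 1):
            counts.update(combinations(items, k))
    threshold = min_support * len(baskets)
    return {s: c for s, c in counts.items() if c >= threshold}


if __name__ == "__main__":
    baskets = make_baskets(n_baskets=200, n_items=20, basket_size=10)
    input_size = sum(len(b) for b in baskets)
    for max_size in (1, 2, 3):
        found = frequent_itemsets(baskets, min_support=0.05, max_size=max_size)
        print(f"max_size={max_size}: {input_size} input items, "
              f"{len(found)} frequent itemsets")
```

Raising max_size or lowering min_support grows the number of candidate
itemsets combinatorially, so output outgrows input quickly.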
On Fri, Apr 29, 2011 at 8:02 AM, elton sky wrote:
One of the assumptions MapReduce makes, I think, is that the size of a
map's output is smaller than its input, although many applications
produce output the same size as their input, such as sort, merge, etc.
For benchmarking purposes, I am looking for some non-trivial, real-life
applications that create *bigger* output than their input. A trivial
example I can think of is a cross join...
I would really appreciate it if you shared your knowledge with me.