If you are looking for exponential output volumes, run a market basket analysis. You only need a few thousand baskets of 100 or so items each. Set the minimum support very low (5%) and the confidence at 80%, and you should get a flood of output. Any frequent itemset (FIS) mining algorithm should do the job.
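
To give a feel for the blow-up, here is a minimal Python sketch of the idea. It is not Apriori or FP-growth, just a brute-force count in which a "map" step emits every itemset of up to max_size items found in each basket. The function names (candidate_itemsets, frequent_itemsets, make_baskets) and the random data are made up for illustration, and the baskets are kept far smaller than 100 items so that it finishes quickly.

```python
from collections import Counter
from itertools import combinations
import random


def candidate_itemsets(basket, max_size):
    """Map step: emit every itemset of 1..max_size items contained in one basket."""
    items = sorted(set(basket))
    for k in range(1, max_size + 1):
        yield from combinations(items, k)


def frequent_itemsets(baskets, min_support, max_size):
    """Reduce step: count the emitted itemsets and keep those meeting min_support.

    Returns the frequent itemsets with their counts, plus the total number of
    records the map step emitted.
    """
    counts = Counter()
    emitted = 0
    for basket in baskets:
        for itemset in candidate_itemsets(basket, max_size):
            counts[itemset] += 1
            emitted += 1
    threshold = min_support * len(baskets)
    frequent = {s: c for s, c in counts.items() if c >= threshold}
    return frequent, emitted


def make_baskets(n_baskets, n_items, basket_size, seed=0):
    """Hypothetical test data: random baskets of distinct item ids."""
    rng = random.Random(seed)
    return [rng.sample(range(n_items), basket_size) for _ in range(n_baskets)]


if __name__ == "__main__":
    baskets = make_baskets(n_baskets=200, n_items=20, basket_size=10)
    frequent, emitted = frequent_itemsets(baskets, min_support=0.05, max_size=3)
    input_records = sum(len(b) for b in baskets)
    print("input items:      ", input_records)
    print("emitted itemsets: ", emitted)
    print("frequent itemsets:", len(frequent))
```

A basket of n items emits C(n,1) + C(n,2) + ... + C(n,max_size) itemsets, so the intermediate output grows very quickly as baskets get larger or max_size is raised.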

On Fri, Apr 29, 2011 at 8:02 AM, elton sky wrote:

One of the assumptions MapReduce makes, I think, is that a map's output is
smaller than its input, although many applications, such as sort and merge,
produce output the same size as their input.
For my benchmarking, I am looking for non-trivial, real-life applications
that create *bigger* output than their input. The trivial example I can
think of is a cross join...

I would really appreciate it if you shared your knowledge with me.
