Arun,

You can run synthetic workloads such as large-scale sort and wordcount, or more realistic workloads like PigMix (https://cwiki.apache.org/confluence/display/PIG/PigMix). On a reasonably sized cluster, these workloads work pretty well. Is there a specific reason why you want traces of varied sizes from various organizations?
> How can I make sure that Rumen generates only, say, 25 or 50 jobs?

Do you want to get 25/50 jobs based on some filtering criterion? I recently faced a similar situation where I wanted to extract jobs from a Rumen trace based on job IDs. I will be happy to share these filtering tools.

Amar

On 12/1/11 8:48 AM, "ArunKumar" <arunk...@gmail.com> wrote:

Hi guys!

Apart from generating job traces with Rumen, can I get logs or job traces of varied sizes from some organizations? How can I make sure that Rumen generates only, say, 25 or 50 jobs?

Thanks,
Arun
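
The filtering tools mentioned in the reply are not included in this thread. As a rough sketch of the idea only (keep the first N jobs of a trace, or only the jobs with given job IDs), something like the following Python could work. It assumes the trace file is a sequence of concatenated JSON job objects and that each object carries a `jobID` field; both are assumptions about the Rumen trace layout and should be checked against your own trace. The function names `iter_jobs`, `filter_trace`, and `dump_trace` are hypothetical and are not part of Rumen.

```python
import json


def iter_jobs(text):
    """Yield each top-level JSON object from a string of concatenated objects."""
    decoder = json.JSONDecoder()
    pos, end = 0, len(text)
    while pos < end:
        # Skip whitespace between consecutive objects.
        while pos < end and text[pos].isspace():
            pos += 1
        if pos >= end:
            break
        obj, pos = decoder.raw_decode(text, pos)
        yield obj


def filter_trace(text, limit=None, job_ids=None):
    """Return jobs from the trace, optionally restricted by ID and by count.

    limit   -- keep at most this many jobs (None means no cap)
    job_ids -- keep only jobs whose "jobID" is in this set (None means all);
               the "jobID" key is an assumption about the trace format
    """
    kept = []
    for job in iter_jobs(text):
        if limit is not None and len(kept) >= limit:
            break
        if job_ids is not None and job.get("jobID") not in job_ids:
            continue
        kept.append(job)
    return kept


def dump_trace(jobs):
    """Serialize jobs back into the same concatenated-objects layout."""
    return "\n".join(json.dumps(job, indent=2) for job in jobs) + "\n"
```

For example, reading a trace file into a string and calling `dump_trace(filter_trace(text, limit=25))` would produce a 25-job trace, and passing `job_ids={...}` instead would select specific jobs. Whether the result replays correctly depends on the tool consuming the trace, so treat this as a starting point.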