Re: Estimating Time required to compute M/Rjob

2011-04-18 Thread real great..
Sure, will do. :) On Mon, Apr 18, 2011 at 11:58 AM, Matthew Foley wrote: > R.V., > I was only suggesting one way to tackle the problem; I don't have a list of > appropriate parameters. > I think Ted has much more experience in this area, and he is encouraging > you to stay with the generic approa

Re: Estimating Time required to compute M/Rjob

2011-04-17 Thread Matthew Foley
R.V., I was only suggesting one way to tackle the problem; I don't have a list of appropriate parameters. I think Ted has much more experience in this area, and he is encouraging you to stay with the generic approach. You should study that paper he recommended, the approach looks really powerfu

Re: Estimating Time required to compute M/Rjob

2011-04-17 Thread real great..
@Matthew: initially I wanted to concentrate on a generic class of applications, but I wouldn't mind sticking to one now. Can I know something more about the descriptive parameters? @all: has anybody done something similar and got results? On Mon, Apr 18, 2011 at 5:55 AM, James Seigel Tynt wrote: > Y

Re: Estimating Time required to compute M/Rjob

2011-04-17 Thread James Seigel Tynt
Yup. I'm boring On 2011-04-17, at 6:07 PM, Ted Dunning wrote: > Turing completion isn't the central question here, really. The truth > is, map-reduce programs have considerable pressure to be written in a > scalable fashion which limits them to fairly simple behaviors that > result in pretty

Re: Estimating Time required to compute M/Rjob

2011-04-17 Thread Ted Dunning
Turing completion isn't the central question here, really. The truth is, map-reduce programs have considerable pressure to be written in a scalable fashion which limits them to fairly simple behaviors that result in pretty linear dependence of run-time on input size for a given program. The cool
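
To illustrate the point about roughly linear dependence of run-time on input size, here is a minimal sketch that fits a straight line to a few measured runs of one program and extrapolates to a larger input. It assumes that the linear model actually holds for the job in question and that the measurements come from the same cluster and configuration. The function names and the sample numbers are made up for illustration and are not taken from this thread.

```python
def fit_linear(sizes, runtimes):
    """Ordinary least squares fit of: runtime = intercept + slope * size."""
    n = len(sizes)
    if n < 2 or n != len(runtimes):
        raise ValueError("need at least two (size, runtime) pairs")
    mean_x = sum(sizes) / n
    mean_y = sum(runtimes) / n
    # Sum of squared deviations of the input sizes.
    sxx = sum((x - mean_x) ** 2 for x in sizes)
    if sxx == 0:
        raise ValueError("input sizes must not all be equal")
    # Sum of cross deviations between sizes and runtimes.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(sizes, runtimes))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def predict_runtime(model, size):
    """Estimate the runtime for a given input size from a fitted model."""
    intercept, slope = model
    return intercept + slope * size


if __name__ == "__main__":
    # Hypothetical measurements: input size in GB, runtime in seconds.
    model = fit_linear([1, 2, 4, 8], [40, 50, 70, 110])
    print(predict_runtime(model, 16))
```

The intercept roughly stands for fixed costs such as job startup, and the slope for the per-unit processing cost. A job whose run-time is not close to linear in input size would need a different model.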

Re: Estimating Time required to compute M/Rjob

2011-04-17 Thread Lance Norskog
ROC Convex Hull is an analysis technique for optimizing parameters for given outputs. For example, if a classification technique has tuning knobs, ROCCH will find the settings that give a desired failure rate. On Sun, Apr 17, 2011 at 12:07 PM, Matthew Foley wrote: > Since general M/R jobs vary o

Re: Estimating Time required to compute M/Rjob

2011-04-17 Thread Matthew Foley
Since general M/R jobs vary over a huge (Turing problem equivalent!) range of behaviors, a more tractable problem might be to characterize the descriptive parameters needed to answer the question: "If the following problem P runs in T0 amount of time on a certain benchmark platform B0, how long

Re: Estimating Time required to compute M/Rjob

2011-04-17 Thread real great..
Thanks a lot guys..will go through it all. On Sun, Apr 17, 2011 at 3:33 AM, Ted Dunning wrote: > Sounds like this paper might help you: > > Predicting Multiple Performance Metrics for Queries: Better Decisions > Enabled by Machine Learning by Ganapathi, Archana, Harumi Kuno, > Umeshwar Dayal, J

Re: Estimating Time required to compute M/Rjob

2011-04-16 Thread Ted Dunning
Sounds like this paper might help you: Predicting Multiple Performance Metrics for Queries: Better Decisions Enabled by Machine Learning by Ganapathi, Archana, Harumi Kuno, Umeshwar Dayal, Janet Wiener, Armando Fox, Michael Jordan, & David Patterson http://radlab.cs.berkeley.edu/publication/187

Re: Estimating Time required to compute M/Rjob

2011-04-16 Thread Stephen Boesch
Some additional thoughts about the 'variables' involved in characterizing the M/R application itself. - the configuration of the cluster for numbers of mappers vs reducers compared to the characteristics (amount of work/processing) required in each of the map/shuffle/reduce stages

Re: Estimating Time required to compute M/Rjob

2011-04-16 Thread Stephen Boesch
You could consider two scenarios / sets of requirements for your estimator: 1. Allow it to 'learn' from certain input data and then project running times of similar (or moderately dissimilar) workloads. So the first steps could be to define a couple of relatively small "control" M/R jo
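
One possible reading of the "learn from control jobs, then project" idea is a nearest-neighbour estimator: record a feature vector and a measured runtime for each control run, then estimate a new job's runtime from the most similar recorded runs. The sketch below is only an assumption about how this could look. The choice of features (input size in GB, number of reducers), the function name, and all numbers are hypothetical and do not come from this thread.

```python
import math


def predict_from_control_runs(control_runs, query, k=2):
    """Average the runtimes of the k control runs closest to `query`.

    control_runs: list of (features, runtime_seconds) pairs, where
    features is a tuple of numbers with the same length as `query`.
    """
    if not control_runs:
        raise ValueError("need at least one control run")
    k = min(k, len(control_runs))
    # Rank the control runs by Euclidean distance to the query features.
    ranked = sorted(control_runs, key=lambda run: math.dist(run[0], query))
    nearest = ranked[:k]
    return sum(runtime for _, runtime in nearest) / k


if __name__ == "__main__":
    # Hypothetical control runs: (input size in GB, reducers) -> seconds.
    runs = [
        ((1.0, 4.0), 60.0),
        ((2.0, 4.0), 100.0),
        ((8.0, 16.0), 400.0),
    ]
    print(predict_from_control_runs(runs, (1.5, 4.0), k=2))
```

In practice the features would need to be scaled to comparable ranges before distances are computed, and the estimate is only as good as the similarity between the control jobs and the new workload.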

Re: Estimating Time required to compute M/Rjob

2011-04-16 Thread Sonal Goyal
What is your MR job doing? What is the amount of data it is processing? What kind of a cluster do you have? Would you be able to share some details about what you are trying to do? If you are looking for metrics, you could look at the Terasort run .. Thanks and Regards, Sonal

Estimating Time required to compute M/Rjob

2011-04-16 Thread real great..
Hi, As part of my final-year BE project I want to estimate the time required by an M/R job given an application and a base file system. Can you folks please help me by posting some thoughts on this issue or posting some links here. -- Regards, R.V.