Re: Predicting how many values will I see in a call to reduce?

2010-11-09 Thread Owen O'Malley
On Sun, Nov 7, 2010 at 5:38 AM, Anthony Urso wrote:
> Is there any way to know how many values I will see in a call to
> reduce without first counting through them all with the iterator?

No, there currently isn't. The framework doesn't have the information until the iterator is exhausted. The i

Re: Predicting how many values will I see in a call to reduce?

2010-11-08 Thread Lance Norskog
It is key to the scheduling paradigm of Hadoop that it doesn't have to tell you how many or when. It would have to store up all of the data for your key before activating your reducer. This is exactly what it cannot do and scale. (right?)

On Mon, Nov 8, 2010 at 3:32 AM, Niels Basjes wrote:
> Hi,

Re: Predicting how many values will I see in a call to reduce?

2010-11-08 Thread Niels Basjes
Hi,

2010/11/7 Anthony Urso
> Is there any way to know how many values I will see in a call to
> reduce without first counting through them all with the iterator?
>
> Under 0.21? 0.20? 0.19?
>

I've looked for an answer to the same question a while ago and came to the conclusion that you can't. T

Predicting how many values will I see in a call to reduce?

2010-11-07 Thread Anthony Urso
Is there any way to know how many values I will see in a call to reduce without first counting through them all with the iterator?

Under 0.21? 0.20? 0.19?

Thanks,
Anthony
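
Since the replies agree that the count is not known until the iterator is exhausted, one possible workaround is to drain the iterator into a list first and only then do the real work. The following is a minimal sketch in plain Java, not Hadoop code: the class ReduceValueCounter and its methods bufferValues and labelWithPosition are hypothetical names made up for illustration, and a plain java.util.Iterator stands in for the values passed to reduce. It assumes all values for one key fit in memory, which is exactly the scaling limit raised in the thread. In a real reducer you would probably also need to copy each value before storing it, because Hadoop reuses the value object between iterations.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical helper, not part of Hadoop: shows the "buffer, then count" pattern.
class ReduceValueCounter {

    // Drains a one-shot iterator into a list so that size() gives the value count.
    // Assumption: every value for the key fits in memory.
    public static <T> List<T> bufferValues(Iterator<T> values) {
        List<T> buffered = new ArrayList<>();
        while (values.hasNext()) {
            buffered.add(values.next());
        }
        return buffered;
    }

    // Example of work that needs the total up front:
    // label each value with its position and the total number of values.
    public static List<String> labelWithPosition(Iterator<String> values) {
        List<String> buffered = bufferValues(values);
        int total = buffered.size(); // known only after the iterator is exhausted
        List<String> labeled = new ArrayList<>(total);
        for (int i = 0; i < total; i++) {
            labeled.add((i + 1) + " of " + total + ": " + buffered.get(i));
        }
        return labeled;
    }

    public static void main(String[] args) {
        Iterator<String> values = Arrays.asList("a", "b", "c").iterator();
        for (String line : labelWithPosition(values)) {
            System.out.println(line);
        }
    }
}
```

If the values for a key might not fit in memory, this does not help. An alternative is to have an earlier job (or a combiner-style pre-aggregation) emit the per-key count as a separate record, at the cost of an extra pass over the data.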