How frequently can I set status?

2010-12-23 Thread W.P. McNeill
I have a loop that runs over a large number of iterations (on the order of
100,000) very quickly.  It is nice to call context.setStatus() with an
indication of where I am in the loop.  Currently I'm only calling setStatus()
every 10,000 iterations because I don't want to overwhelm the task trackers
with lots of status messages.  Is this something I should be worried about, or
is Hadoop designed to handle a high volume of status messages?  If it is, I'll
just call setStatus() every iteration.
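
For reference, the every-N-iterations approach described above might look roughly like the sketch below. StatusSink is a hypothetical stand-in for the single Hadoop call involved, context.setStatus(String), so that the snippet compiles without Hadoop on the classpath. ThrottledLoop and the message format are likewise made up for illustration.

```java
/** Hypothetical stand-in for the one Hadoop call used here, context.setStatus(String). */
interface StatusSink {
    void setStatus(String status);
}

class ThrottledLoop {
    /**
     * Runs `iterations` loop steps and reports progress once every
     * `interval` steps. Returns the number of status updates issued.
     */
    static int run(int iterations, int interval, StatusSink sink) {
        int updates = 0;
        for (int i = 1; i <= iterations; i++) {
            // ... the real per-iteration work would go here ...
            if (i % interval == 0) {
                sink.setStatus("Processed " + i + " of " + iterations);
                updates++;
            }
        }
        return updates;
    }
}
```

In a real mapper or reducer the sink would simply be `context::setStatus`. With 100,000 iterations and an interval of 10,000 this issues ten updates in total.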


Re: How frequently can I set status?

2010-12-23 Thread Ted Dunning
It is reasonable to update counters often, but I think you are right to
limit the number of status updates.
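
One way to combine the two, sketched under assumptions: bump a counter on every iteration but let a status string through at most once per time window. Nothing below is Hadoop API. RateLimitedStatus, its fields, and the injected clock are hypothetical names, and the plain `processed` field only stands in for a real job counter.

```java
import java.util.function.LongSupplier;

class RateLimitedStatus {
    private final long minGapMillis;
    private final LongSupplier clockMillis; // injectable so the sketch is testable
    private boolean sentBefore = false;
    private long lastSentMillis;

    long processed = 0;    // stand-in for a job counter, bumped every iteration
    int statusUpdates = 0; // how many status strings were actually produced

    RateLimitedStatus(long minGapMillis, LongSupplier clockMillis) {
        this.minGapMillis = minGapMillis;
        this.clockMillis = clockMillis;
    }

    /**
     * Call once per iteration. Always increments the counter. Returns a
     * status string when at least minGapMillis has passed since the last
     * one (or on the first call), otherwise null.
     */
    String tick(long total) {
        processed++;
        long now = clockMillis.getAsLong();
        if (!sentBefore || now - lastSentMillis >= minGapMillis) {
            sentBefore = true;
            lastSentMillis = now;
            statusUpdates++;
            return "Processed " + processed + " of " + total;
        }
        return null;
    }
}
```

A caller would pass `System::currentTimeMillis` as the clock and hand any non-null result to context.setStatus(). Limiting by time rather than by iteration count keeps the update rate steady even when the speed of the loop varies.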

On Thu, Dec 23, 2010 at 11:15 AM, W.P. McNeill wrote:

> I have a loop that runs over a large number of iterations (on the order of
> 100,000) very quickly.  It is nice to call context.setStatus() with an
> indication of where I am in the loop.  Currently I'm only calling setStatus()
> every 10,000 iterations because I don't want to overwhelm the task trackers
> with lots of status messages.  Is this something I should be worried about, or
> is Hadoop designed to handle a high volume of status messages?  If it is, I'll
> just call setStatus() every iteration.
>


Re: How frequently can I set status?

2010-12-23 Thread Ken
If I remember correctly, status is only sent on heartbeat, which means that if
you are setting it inside a fast-running loop, you won't see every status
message, only the status message that was current when the heartbeat was sent
to the jobtracker.
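
A toy model of the behavior described above, assuming the recollection is right: setStatus() just overwrites a single slot, and a heartbeat reads whatever is in the slot at that moment. This is not Hadoop's actual implementation. HeartbeatModel and the fixed heartbeat spacing are invented for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicReference;

class HeartbeatModel {
    // Single slot: each new status replaces the previous one, nothing is queued.
    private final AtomicReference<String> current = new AtomicReference<>("");
    private final List<String> reported = new ArrayList<>();

    void setStatus(String status) {
        current.set(status);
    }

    // A heartbeat samples only the status that is current right now.
    void heartbeat() {
        reported.add(current.get());
    }

    /** Sets a status on every iteration and fires a heartbeat every `heartbeatEvery` iterations. */
    static List<String> simulate(int iterations, int heartbeatEvery) {
        HeartbeatModel model = new HeartbeatModel();
        for (int i = 1; i <= iterations; i++) {
            model.setStatus("iteration " + i);
            if (i % heartbeatEvery == 0) {
                model.heartbeat();
            }
        }
        return model.reported;
    }
}
```

Under this model, 100,000 setStatus() calls with a heartbeat every 25,000 iterations yield only four reported messages. The other calls are overwritten before anything reads them.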

On Dec 23, 2010, at 11:41 AM, Ted Dunning wrote:

> It is reasonable to update counters often, but I think you are right to
> limit the number of status updates.


Re: How frequently can I set status?

2010-12-23 Thread W.P. McNeill
I figured that was the case and it's okay if I don't see every status
message, as long as it doesn't hurt anything to send them.

On Thu, Dec 23, 2010 at 12:51 PM, Ken wrote:

> If I remember correctly, status is only sent on heartbeat, which means that if
> you are setting it inside a fast-running loop, you won't see every status
> message, only the status message that was current when the heartbeat was sent
> to the jobtracker.