You can do a rolling restart of the nodes. The customer won't notice and running programs will still complete in good order. If you have rack awareness configured, you can restart as many datanodes in a single rack as you like since that won't compromise replication. Restarting task trackers should be similarly painless since any tasks on that node will just be re-run.
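The rolling restart described above can be sketched with the stock hadoop-daemon.sh script that ships in the Hadoop bin directory. The worker host names and the install path below are placeholders, not values from this thread; this is an assumed ops procedure, not something you can dry-run outside a real cluster, so adjust it to your environment.

```shell
# Hypothetical worker list and install path -- substitute your own.
WORKERS="node1 node2 node3"
HADOOP_HOME=/opt/hadoop

for host in $WORKERS; do
  # Restart the datanode; replicas on other racks keep blocks available.
  ssh "$host" "$HADOOP_HOME/bin/hadoop-daemon.sh stop datanode && \
               $HADOOP_HOME/bin/hadoop-daemon.sh start datanode"
  # Restart the tasktracker; tasks it was running are simply re-scheduled.
  ssh "$host" "$HADOOP_HOME/bin/hadoop-daemon.sh stop tasktracker && \
               $HADOOP_HOME/bin/hadoop-daemon.sh start tasktracker"
  # Pause so the datanode re-registers with the namenode before the
  # next node goes down, keeping replication intact throughout.
  sleep 30
done
```

If rack awareness is configured, the loop can safely be run in parallel across all nodes of a single rack, since no rack holds every replica of a block.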
A few jobs may notice small delays in completion due to re-run tasks, but the effect should be minimal, especially if you have speculative execution enabled. To change configuration on the namenode or jobtracker, you will need to schedule a few seconds of cluster downtime to restart those processes. That should not be a problem, since Hadoop should not generally be used in a production situation requiring high availability.

On Thu, Aug 13, 2009 at 3:08 PM, Arvind Sharma <arvind...@yahoo.com> wrote:

> Sorry, I should have mentioned that - this I want to do without a code
> change.
>
> Something like - I have the cluster up and running and suddenly I realize
> that I forgot to add some properties in the hadoop-site.xml file. Now I can
> add these new properties - but how do they take effect? Without
> re-starting the cluster (which is in production, and the customer wouldn't
> like that either :-) )

--
Ted Dunning, CTO
DeepDyve