In a more "sysadminish" sense... has anyone tried out Rocks[1] for
Hadoop cluster deployment/management? I'm about to start with it...

[1] http://www.rocksclusters.org

On Tue, May 20, 2008 at 5:23 AM, Alejandro Abdelnur <[EMAIL PROTECTED]> wrote:
> That would be an option too.
>
> On Mon, May 19, 2008 at 10:26 PM, Ted Dunning <[EMAIL PROTECTED]> wrote:
>>
>> I think it would be better to have the client retrieve the default
>> configuration. Not all configuration settings are simple overrides. Some
>> are read-modify-write operations.
>>
>> This also fits the current code better.
>>
>>
>> On 5/19/08 6:38 AM, "Steve Loughran" <[EMAIL PROTECTED]> wrote:
>>
>>> Alejandro Abdelnur wrote:
>>>> A while ago I opened an issue related to this topic:
>>>>
>>>>   https://issues.apache.org/jira/browse/HADOOP-3287
>>>>
>>>> My take is a little different: when submitting a job, the clients
>>>> should only send to the jobtracker the configuration they explicitly
>>>> set; the jobtracker would then apply the defaults for all the other
>>>> configuration.
>>>>
>>>> By doing this the cluster admin can modify things at any time, and
>>>> changes to default values take effect for all clients without having
>>>> to distribute a new configuration to all clients.
>>>>
>>>> IMO, this approach was the intended behavior at some point, according
>>>> to the Configuration.write(OutputStream) javadocs: 'Writes non-default
>>>> properties in this configuration.' But as the write method also writes
>>>> default properties, this is not happening.
>>>
>>> I'll keep an eye on that issue. I think a key problem right now is that
>>> clients take their config from the configuration file in the core jar,
>>> and from their own settings. You need to keep the settings in sync
>>> somehow, and have to take what the core jar provides.
>>>
>>>
>>>> This approach would also get rid of the separate mechanism (zookeeper,
>>>> svn, etc) to keep clients synchronized as there would be no need to do
>>>> so.
>>>
>>> ZooKeeper and similar are to keep the cluster alive; they shouldn't be
>>> needed for clients, which should only need some URL of a jobtracker to
>>> talk to.
>>
>>
>
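
To make the two positions in the quoted thread concrete, here is a minimal sketch, not Hadoop code. Plain `java.util.Map` objects stand in for Hadoop's `Configuration`, and the class name `ConfigMergeSketch`, the method names, and the `example.*` property keys are all hypothetical. `resolveOnServer` illustrates the "client sends only explicit overrides, jobtracker applies defaults" idea; `appendToDefault` illustrates why a read-modify-write setting needs the client to see the default first.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: plain maps stand in for Hadoop's Configuration class.
final class ConfigMergeSketch {

    // Server side: start from the cluster defaults, then apply only what the
    // client explicitly set, so admin changes to defaults reach every job.
    static Map<String, String> resolveOnServer(Map<String, String> clusterDefaults,
                                               Map<String, String> clientOverrides) {
        Map<String, String> effective = new HashMap<>(clusterDefaults);
        effective.putAll(clientOverrides);
        return effective;
    }

    // Client side read-modify-write: the new value is built from the current
    // default, so the client has to retrieve that default before submitting.
    static String appendToDefault(Map<String, String> fetchedDefaults,
                                  String key, String extra) {
        String current = fetchedDefaults.get(key);
        return (current == null || current.isEmpty()) ? extra : current + "," + extra;
    }

    public static void main(String[] args) {
        // Made-up keys and values, for illustration only.
        Map<String, String> defaults = new HashMap<>();
        defaults.put("example.replication", "3");
        defaults.put("example.codecs", "codecA");

        Map<String, String> overrides = new HashMap<>();
        overrides.put("example.replication", "2"); // simple override
        overrides.put("example.codecs",
                appendToDefault(defaults, "example.codecs", "codecB")); // read-modify-write

        System.out.println(resolveOnServer(defaults, overrides));
    }
}
```

A simple override such as `example.replication` works without the client ever seeing the defaults, whereas the appended `example.codecs` value can only be built after the defaults have been fetched, which is the case Ted raises.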
