Khalil Honsali wrote:
Thanks Mr. Steve, and everyone..

I actually have just 16 machines (normal P4 PCs), so whenever I need to do
things manually it takes half an hour (for example, when installing sun-java
I had to type 'yes' for each .bin install).
For now I'm OK with pssh or just a simple custom script, but I'm
afraid things will get complicated soon enough...

You said:
"you can automate rpm install using pure "rpm" command, and check for
installed artifacts yourself"
Could you please explain more? I understand you run the same rpm against all
machines, provided the cluster is homogeneous.


1. you can push out the same RPM files to all machines.

2. if you use rpmbuild (ant's <rpm> task does this), you can build your own RPMs and push them out, possibly with scp, then run ssh to install them.
http://wiki.smartfrog.org/wiki/display/sf/RPM+Files
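A minimal sketch of that scp-then-ssh pattern in plain shell, assuming passwordless root ssh; the host names and the /tmp staging path are made up for illustration, and setting RUN=echo turns it into a dry run that only prints the commands:

```shell
#!/bin/sh
# Push the same RPMs to every node with scp, then install them over ssh.
HOSTS=${HOSTS:-"node01 node02 node03"}   # hypothetical host list
RUN=${RUN:-}                             # set RUN=echo for a dry run

push_and_install() {
  for host in $HOSTS; do
    $RUN scp smartfrog-*.rpm "root@$host:/tmp/" &&
      $RUN ssh "root@$host" "rpm --upgrade --force /tmp/smartfrog-*.rpm"
  done
}
```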

3. A lot of linux distros have adopted Yum
http://wiki.smartfrog.org/wiki/display/sf/Pattern+-+Yum
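With the packages in a yum repository, the per-node step collapses to a single non-interactive command; a sketch with the same hypothetical host list (-y answers yum's prompts so it runs unattended):

```shell
#!/bin/sh
# Run a non-interactive yum install on every node over ssh.
HOSTS=${HOSTS:-"node01 node02 node03"}   # hypothetical host list
RUN=${RUN:-}                             # set RUN=echo for a dry run

yum_install_all() {
  pkg="$1"
  for host in $HOSTS; do
    $RUN ssh "root@$host" "yum -y install $pkg"
  done
}
```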


I was discussing Yum support on the Config-Management list last week, funnily enough
http://lopsa.org/pipermail/config-mgmt/2008-April/000662.html

Nobody much likes automating it, as:
 -it doesn't provide much state information
 -it doesn't let you roll back very easily, or fix exactly what you want

Most people in that group (the CM tool authors) prefer to automate RPM install/rollback themselves, so they can stay in control.
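A sketch of that do-it-yourself approach (the package name is just an example): query the state with rpm -q first, decide from its output, and keep removal as the explicit rollback path.

```shell
#!/bin/sh
# Decide actions from `rpm -q` output instead of trusting a higher-level tool.
# needs_install takes the text printed by `rpm -q <pkg>` and returns 0 (true)
# when the package is absent, i.e. rpm said "package <pkg> is not installed".
needs_install() {
  case "$1" in
    *"is not installed"*) return 0 ;;
    *)                    return 1 ;;
  esac
}

# Typical per-node use:
#   if needs_install "$(rpm -q smartfrog)"; then
#     rpm --install smartfrog-*.rpm
#   fi
#   rpm --erase smartfrog    # the explicit rollback when you need to back out
```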

Looking at how our build.xml file manages test RPMs (that is, pushing from the build VMware image to a clean test image), we <scp> the files over and then <ssh> in to run the operations:


    <scp remoteToDir="${rpm.ssh.path}"
        passphrase="${rpm.ssh.passphrase}"
        keyfile="${rpm.ssh.keyfile}"
        trust="${rpm.ssh.trust}"
        verbose="${rpm.ssh.verbose}">
      <fileset refid="rpm.upload.fileset"/>
    </scp>



  <target name="rpm-remote-install-all" depends="rpm-upload">
    <rootssh
        command="cd ${rpm.full.ssh.dir};rpm --upgrade --force ${rpm.verbosity} smartfrog-*.rpm"
        outputProperty="rpm.result.all"/>
    <validate-rpm-result result="${rpm.result.all}"/>
  </target>


The <rootssh> preset, built on <rpmssh>, runs a remote command as root:

    <presetdef name="rpmssh">
      <sshexec host="${rpm.ssh.server}"
          username="${rpm.ssh.user}"
          passphrase="${rpm.ssh.passphrase}"
          trust="${rpm.ssh.trust}"
          keyfile="${rpm.ssh.keyfile}"
          timeout="${ssh.command.timeout}"
          />
    </presetdef>

    <presetdef name="rootssh">
      <rpmssh
          username="root"
          timeout="${ssh.rpm.command.timeout}"
          />
    </presetdef>

More troublesome is how we check for errors: there's no simple exit code here, so instead I have to scan the response for error strings.

    <macrodef name="validate-rpm-result">
      <attribute name="result"/>
      <sequential>
        <echo>
          @{result}
        </echo>
        <fail>
          <condition>
            <contains
                string="@{result}"
                substring="does not exist"/>
          </condition>
          The rpm contains files belonging to an unknown user.
        </fail>
      </sequential>
    </macrodef>
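The same string-scan works in plain shell for anyone driving ssh directly; "does not exist" is the substring the macro above looks for:

```shell
#!/bin/sh
# Scan captured rpm output for known failure strings; the text a remote
# command prints is often all we get back.
validate_rpm_result() {
  case "$1" in
    *"does not exist"*)
      echo "The rpm contains files belonging to an unknown user" >&2
      return 1 ;;
  esac
  return 0
}
```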

Then, once everything is installed, I do something even scarier: run lots of query commands and look for error strings. I do need to automate this better; it's on my todo list, and one thing I might use as a test project would be automating the creation of custom Hadoop EC2 images, something like:

-bring up the image
-push out new RPMs and ssh keys, including JVM versions.
-create the new AMI
-set the AMI access rights up.
-delete the old one.
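A rough sketch of those steps with the EC2 command-line tools of the day; every AMI ID, bucket, hostname and flag here is a placeholder rather than a tested recipe, and the script defaults to a dry run that only prints the commands:

```shell
#!/bin/sh
# Hypothetical AMI-refresh sequence; all names are placeholders.
RUN=${RUN:-echo}                     # clear RUN to actually execute
INSTANCE=${INSTANCE:-build-host}     # hostname of the freshly booted instance

refresh_hadoop_ami() {
  $RUN ec2-run-instances ami-OLDIMAGE -k hadoop-keypair   # bring up the image
  $RUN scp new-rpms/*.rpm jdk.rpm "root@$INSTANCE:/tmp/"  # push RPMs + JVM
  $RUN scp authorized_keys "root@$INSTANCE:/root/.ssh/"   # and new ssh keys
  $RUN ssh "root@$INSTANCE" "rpm --upgrade --force /tmp/*.rpm"
  $RUN ec2-register mybucket/hadoop.manifest.xml          # create the new AMI
  $RUN ec2-modify-image-attribute ami-NEWIMAGE -l -a all  # set access rights
  $RUN ec2-deregister ami-OLDIMAGE                        # delete the old one
}
```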

Like I said, on the todo list.




--
Steve Loughran                  http://www.1060.org/blogxter/publish/5
Author: Ant in Action           http://antbook.org/
