-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On Tuesday July 05, 2011, Sunil Venkatesh wrote:
> Hi Josh,
> 
> I was able to get the following things done with respect to getting VCL to
> work on POWER.
> 
> 1. Made modifications in the xcat tables to get the capture process
> working with statelite images instead of stateless images, particularly
> in the noderes & bootparams tables.
> 
> 2. Used partimage to capture the images (did NOT set usepartimageng to 1).
> 
> -rw-r--r-- 1 root root    0 Jul  5 16:38 compute.img.capturedone
> -rw-r--r-- 1 root root    0 Jul  5 15:58 compute.img.capturefailed
> -rw------- 1 root root 6.5M Jul  5 16:07 compute-parta2.gz
> -rw------- 1 root root 679M Jul  5 16:10 compute-parta3.gz
> -rw------- 1 root root  23M Jul  5 16:38 compute-parta6.gz
> -rw-r--r-- 1 root root  512 Jul  5 16:07 compute-sda.mbr
> -rw-r--r-- 1 root root  363 Jul  5 16:07 compute-sda.sfdisk
> 
> 
> 2 partitions including the boot partition present on the blade were
> captured under /install/image/ppc64/. Initially, RHEL 5 was installed on
> a 600 GB partition, which caused the capture process to fail. The image of
> the partition was generated once the partition size was reduced to 6 GB.
> Is it necessary for me to use partimage-ng instead of partimage itself?

Are you asking if you need to use partimage-ng for partitions that are 600GB?  
If so, I don't really know.  We've never dealt with partitions that large.
 
> When proceeding further with "vcld --setup", the script was not able to
> find the images that were created using partimage. The options that are
> provided in the script do not allow for selecting an architecture
> other than x86/x86_64.

You'll need to modify the vcld image.pm module.  Look in 
/usr/local/vcl/lib/VCL.  In image.pm, look for the function 
'setup_capture_base_image'; then, find 'my @architecture_choices' and add 
'ppc' as another option.
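
A minimal sketch for locating that line before editing, assuming a POSIX
shell and grep.  The helper name find_arch_choices is hypothetical and not
part of VCL; only the path and the search string come from the text above.

```shell
# Hypothetical helper (not part of VCL): print the line number(s) where
# the architecture choices array is declared, so it can be edited by hand.
find_arch_choices() {
    # $1: path to image.pm; defaults to the location mentioned above
    module="${1:-/usr/local/vcl/lib/VCL/image.pm}"
    # -n prefixes every matching line with its line number
    grep -n 'my @architecture_choices' "$module"
}
```

For example, "find_arch_choices /usr/local/vcl/lib/VCL/image.pm" prints the
matching line number(s).  Keeping a backup copy of image.pm before editing
it is a sensible precaution.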

> Also, in the error log vcld is looking for
> 
> /opt/xcat/share/xcat/install/image/rh5image-power010701bi34-v0.tmpl
> 
> and cannot find the template file. Should the template file that needs
> to be accessed in this case be createimage.ppc64.tmpl?

This is actually a check to make sure the image doesn't already exist before 
trying to capture it.  So, it is good that it doesn't find it.
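
The existence check in your log can be approximated as follows.  This is
only an illustrative sketch, not the actual vcld code: the function name
image_exists is hypothetical, and treating a match in either location as
"exists" is an assumption.  The two locations correspond to the repository
path and the template path printed in the log.

```shell
# Hypothetical approximation of the "does the image already exist" check.
image_exists() {
    # $1: image name, $2: image repository dir, $3: template dir
    name="$1"; repo="$2"; tmpl_dir="$3"
    # Is there any file in the repository whose name contains the image name?
    for f in "$repo"/*"$name"*; do
        if [ -e "$f" ]; then
            return 0
        fi
    done
    # Is there a template file named after the image?
    if [ -e "$tmpl_dir/$name.tmpl" ]; then
        return 0
    fi
    # Neither found: the image is considered new, so capture can proceed
    return 1
}
```

With the paths from the log, a call would look like
"image_exists rh5image-power010701bi34-v0 /tftpboot/xcat/image/x86_64
/opt/xcat/share/xcat/install/image"; a non-zero exit status corresponds to
the "does NOT exist" messages.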

It sounds like you're almost there.  Great work!

Josh

> I have attached a log at the end of the mail. I am not sure where I have
> gone wrong with the VCL configuration.
> 
> -Sunil
> 
> -----
> 
> rh5image-power010701bi34-v0 image creation failed
> ------------------------------------------------------------------------
> time: 2011-07-05 11:03:25
> caller: image.pm:reservation_failed(385)
> ( 0) image.pm, reservation_failed (line: 385)
> (-1) image.pm, process (line: 167)
> (-2) vcld, make_new_child (line: 568)
> (-3) vcld, main (line: 346)
> ------------------------------------------------------------------------
> management node: web1.bluegrit.cs.umbc.edu
> reservation PID: 9866
> parent vcld PID: 19110
> 
> request ID: 30
> reservation ID: 30
> request state/laststate: image/image
> request start time: 2011-07-05 11:03:20
> request end time: 2011-07-05 12:03:20
> for imaging: no
> log ID: none
> 
> computer: power01.bluegrit.cs.umbc.edu
> computer id: 2
> computer type: blade
> computer eth0 MAC address:<undefined>
> computer eth1 MAC address:<undefined>
> computer private IP address: 172.20.106.1
> computer public IP address: 172.20.106.1
> computer in block allocation: no
> provisioning module: VCL::Module::Provisioning::xCAT2
> 
> image: rh5image-power010701bi34-v0
> image display name: power010701bi
> image ID: 34
> image revision ID: 34
> image size: 1450 MB
> use Sysprep: yes
> root access: yes
> image owner ID: 1
> image owner affiliation: Local
> image revision date created: 2011-07-05 11:03:25
> image revision production: yes
> OS module: VCL::Module::OS::Linux
> 
> user: admin
> user name: vcl admin
> user ID: 1
> user affiliation: Local
> ------------------------------------------------------------------------
> RECENT LOG ENTRIES FOR THIS PROCESS:
> 2011-07-05
> 11:03:25|9866|30:30|image|Module.pm:create_os_object(304)|VCL::Module::OS:
> :Linux OS object created for rh5image-power010701bi34-v0, address: 88fb070
> 2011-07-05 11:03:25|9866|30:30|image|xCAT.pm:initialize(110)|XCATROOT
> environment variable is not set, using /opt/xcat 2011-07-05
> 11:03:25|9866|30:30|image|xCAT.pm:initialize(128)|xCAT root path found:
> /opt/xcat 2011-07-05
> 11:03:25|9866|30:30|image|xCAT.pm:initialize(130)|xCAT module initialized
> 2011-07-05 11:03:25|9866|30:30|image|xCAT2.pm:initialize(110)|XCATROOT
> environment variable is not set, using /opt/xcat 2011-07-05
> 11:03:25|9866|30:30|image|xCAT2.pm:initialize(128)|xCAT root path found:
> /opt/xcat 2011-07-05
> 11:03:25|9866|30:30|image|xCAT2.pm:initialize(130)|xCAT module initialized
> 2011-07-05
> 11:03:25|9866|30:30|image|Module.pm:create_provisioning_object(420)|VCL::M
> odule::Provisioning::xCAT2 module loaded 2011-07-05
> 11:03:25|9866|30:30|image|Module.pm:create_mn_os_object(335)|management
> node OS object has already been created, address: 88f23b0, returning 1
> 2011-07-05
> 11:03:25|9866|30:30|image|Module.pm:new(200)|VCL::Module::Provisioning::xC
> AT2 object created for computer power01, address: 88fb0e0 2011-07-05
> 11:03:25|9866|30:30|image|xCAT2.pm:initialize(110)|XCATROOT environment
> variable is not set, using /opt/xcat 2011-07-05
> 11:03:25|9866|30:30|image|xCAT2.pm:initialize(128)|xCAT root path found:
> /opt/xcat 2011-07-05
> 11:03:25|9866|30:30|image|xCAT2.pm:initialize(130)|xCAT module initialized
> 2011-07-05
> 11:03:25|9866|30:30|image|Module.pm:create_provisioning_object(426)|VCL::M
> odule::Provisioning::xCAT2 provisioner object created for power01, address:
> 88fb0e0 2011-07-05
> 11:03:25|9866|30:30|image|State.pm:initialize(126)|returning 1 2011-07-05
> 11:03:25|9866|30:30|image|vcld:make_new_child(565)|VCL::image object
> created and initialized 2011-07-05
> 11:03:25|9866|30:30|image|utils.pm:mail(1268)|SUCCESS -- Sending mail
> To:shru...@gmail.com, VCL IMAGE Creation Started:
> rh5image-power010701bi34-v0 2011-07-05
> 11:03:25|9866|30:30|image|xCAT.pm:does_image_exist(2434)|image OS install
> type: partimage 2011-07-05
> 11:03:25|9866|30:30|image|xCAT.pm:_get_image_repository_path(2910)|managem
> ent node identifier argument was not specified
> 
> 2011-07-05 11:03:25|9866|30:30|image|xCAT.pm:_get_image_repository_path(2932)|attempting
> to determine repository path for image on web1.bluegrit.cs.umbc.edu:
> |9866|30:30|image| image id: 34
> |9866|30:30|image| OS name: rh5image
> |9866|30:30|image| OS type: linux
> |9866|30:30|image| OS install type: partimage
> |9866|30:30|image| OS source path: image
> |9866|30:30|image| architecture: x86_64
> 
> 2011-07-05
> 11:03:25|9866|30:30|image|xCAT.pm:_get_image_repository_path(2996)|did not
> find any images under /tftpboot/xcat//linux_image/x86_64 on
> web1.bluegrit.cs.umbc.edu 2011-07-05
> 11:03:25|9866|30:30|image|xCAT.pm:_get_image_repository_path(3006)|returni
> ng repository path for web1.bluegrit.cs.umbc.edu:
> /tftpboot/xcat//image/x86_64 2011-07-05
> 11:03:25|9866|30:30|image|xCAT.pm:does_image_exist(2444)|image repository
> path: /tftpboot/xcat//image/x86_64
> 
> 2011-07-05 11:03:25|9866|30:30|image|utils.pm:run_command(9010)|executed
> command: du -c /tftpboot/xcat//image/x86_64/*rh5image-power010701bi34-v0* 2>&1
> | grep total 2>&1, pid: 9877, exit status: 0, output:
> |9866|30:30|image| 0 total
> 
> 2011-07-05 11:03:25|9866|30:30|image|xCAT.pm:does_image_exist(2506)|image
> does NOT exist: rh5image-power010701bi34-v0 2011-07-05
> 11:03:25|9866|30:30|image|xCAT2.pm:_get_image_template_path(2084)|manageme
> nt node identifier argument was not specified
> 
> 2011-07-05 11:03:25|9866|30:30|image|xCAT2.pm:_get_image_template_path(2115)|attempting
> to determine template path for image:
> |9866|30:30|image| image name: rh5image-power010701bi34-v0
> |9866|30:30|image| OS install type: partimage
> |9866|30:30|image| OS source path: image
> |9866|30:30|image| xCAT 2.x OS source path: image
> 
> 2011-07-05
> 11:03:25|9866|30:30|image|xCAT2.pm:_get_image_template_path(2123)|returnin
> g: /opt/xcat/share/xcat/install/image 2011-07-05
> 11:03:25|9866|30:30|image|xCAT.pm:does_image_exist(2518)|template
> repository path for rh5image-power010701bi34-v0:
> /opt/xcat/share/xcat/install/image 2011-07-05
> 11:03:25|9866|30:30|image|xCAT.pm:does_image_exist(2530)|template file
> does not exist:
> /opt/xcat/share/xcat/install/image/rh5image-power010701bi34-v0.tmpl
> 2011-07-05 11:03:25|9866|30:30|image|xCAT.pm:does_image_exist(2570)|image
> rh5image-power010701bi34-v0 does NOT exist on this management node
> 2011-07-05 11:03:25|9866|30:30|image|image.pm:process(145)|image
> rh5image-power010701bi34-v0 does not exist in the repository 2011-07-05
> 11:03:25|9866|30:30|image|DataStructure.pm:_automethod(834)|data structure
> updated: $self->request_data->{reservation}{30}{image}{lastupdate}
> 
> |9866|30:30|image| image_lastupdate = 2011-07-05 11:03:25
> 
> 2011-07-05 11:03:25|9866|30:30|image|DataStructure.pm:_automethod(834)|data
> structure updated:
> $self->request_data->{reservation}{30}{imagerevision}{datecreated}
> 
> |9866|30:30|image| imagerevision_date_created = 2011-07-05 11:03:25
> 
> 2011-07-05 11:03:25|9866|30:30|image|image.pm:process(161)|calling
> provisioning module's capture() subroutine 2011-07-05
> 11:03:25|9866|30:30|image|xCAT2.pm:capture(776)|image=rh5image-power010701
> bi34-v0, computer=power01
> 
> 2011-07-05 11:03:25|9866|30:30|image|utils.pm:run_ssh_command(5380)|executing
> SSH command on power01:
> |9866|30:30|image| /usr/bin/ssh -i /etc/vcl/vcl.key  -o
> |StrictHostKeyChecking=no -l root -p 22 -x power01 'chown root
> |currentimage.txt; chmod 777 currentimage.txt' 2>&1
> 
> 2011-07-05 11:03:25|9866|30:30|image|utils.pm:run_ssh_command(5464)|run_ssh_command
> output:
> |9866|30:30|image| Permission denied, please try again.
> |9866|30:30|image| Permission denied, please try again.
> |9866|30:30|image| Permission denied (publickey,gssapi-with-mic,password).
> 
> 2011-07-05 11:03:25|9866|30:30|image|utils.pm:run_ssh_command(5474)|SSH
> command executed on power01, command:
> |9866|30:30|image| /usr/bin/ssh -i /etc/vcl/vcl.key  -o
> |StrictHostKeyChecking=no -l root -p 22 -x power01 'chown root
> |currentimage.txt; chmod 777 currentimage.txt' 2>&1 9866|30:30|image|
> |returning (255, "Permission denied, please try ...")
> 
> 2011-07-05
> 11:03:25|9866|30:30|image|utils.pm:write_currentimage_txt(5685)|updated
> ownership and permissions on currentimage.txt
> 
> 2011-07-05 11:03:25|9866|30:30|image|utils.pm:run_ssh_command(5380)|executing
> SSH command on power01:
> |9866|30:30|image| /usr/bin/ssh -i /etc/vcl/vcl.key  -o
> |StrictHostKeyChecking=no -l root -p 22 -x power01 'echo -e
> |"rh5image-power010701bi34-v0\r\nid=34\r\nprettyname=power010701bi\r\nimag
> |erevision_id=34\r\nimagerevision_datecreated=2011-07-05
> |11:03:25\r\ncomputer_id=2\r\ncomputer_hostname=power01.bluegrit.cs.umbc.e
> |du">  currentimage.txt&&  cat currentimage.txt' 2>&1
> 
> 2011-07-05 11:03:25|9866|30:30|image|utils.pm:run_ssh_command(5464)|run_ssh_command
> output:
> |9866|30:30|image| Permission denied, please try again.
> |9866|30:30|image| Permission denied, please try again.
> |9866|30:30|image| Permission denied (publickey,gssapi-with-mic,password).
> 
> 2011-07-05 11:03:25|9866|30:30|image|utils.pm:run_ssh_command(5474)|SSH
> command executed on power01, command:
> |9866|30:30|image| /usr/bin/ssh -i /etc/vcl/vcl.key  -o
> |StrictHostKeyChecking=no -l root -p 22 -x power01 'echo -e
> |"rh5image-power010701bi34-v0\r\nid=34\r\nprettyname=power010701bi\r\nimag
> |erevision_id=34\r\nimagerevision_datecreated=2011-07-05
> |11:03:25\r\ncomputer_id=2\r\ncomputer_hostname=power01.bluegrit.cs.umbc.e
> |du">  currentimage.txt&&  cat currentimage.txt' 2>&1 9866|30:30|image|
> |returning (255, "Permission denied, please try ...") 9866|30:30|image|
> |---- WARNING ----
> |9866|30:30|image| 2011-07-05
> |11:03:25|9866|30:30|image|utils.pm:write_currentimage_txt(5699)|failed to
> |create currentimage.txt file on power01: 9866|30:30|image| Permission
> |denied, please try again.
> |9866|30:30|image| Permission denied, please try again.
> |9866|30:30|image| Permission denied (publickey,gssapi-with-mic,password).
> |9866|30:30|image| ( 0) utils.pm, write_currentimage_txt (line: 5699)
> |9866|30:30|image| (-1) xCAT2.pm, capture (line: 779)
> |9866|30:30|image| (-2) image.pm, process (line: 162)
> |9866|30:30|image| (-3) vcld, make_new_child (line: 568)
> |9866|30:30|image| (-4) vcld, main (line: 346)
> |9866|30:30|image| ---- WARNING ----
> |9866|30:30|image| 2011-07-05
> |11:03:25|9866|30:30|image|xCAT2.pm:capture(783)|unable to update
> |currentimage.txt on power01 9866|30:30|image| ( 0) xCAT2.pm, capture
> |(line: 783)
> |9866|30:30|image| (-1) image.pm, process (line: 162)
> |9866|30:30|image| (-2) vcld, make_new_child (line: 568)
> |9866|30:30|image| (-3) vcld, main (line: 346)
> |9866|30:30|image| ---- WARNING ----
> |9866|30:30|image| 2011-07-05
> |11:03:25|9866|30:30|image|image.pm:process(166)|rh5image-power010701bi34-
> |v0 image failed to be captured by provisioning module 9866|30:30|image| (
> |0) image.pm, process (line: 166)
> |9866|30:30|image| (-1) vcld, make_new_child (line: 568)
> |9866|30:30|image| (-2) vcld, main (line: 346)
> 
> 2011-07-05
> 11:03:25|9866|30:30|image|DataStructure.pm:get_computer_private_ip_address
> (1581)|attempting to retrieve private IP address for computer: power01
> 2011-07-05
> 11:03:25|9866|30:30|image|DataStructure.pm:get_computer_private_ip_address
> (1585)|retrieved contents of /etc/hosts on this management node, contains
> 158 lines 2011-07-05
> 11:03:25|9866|30:30|image|DataStructure.pm:get_computer_private_ip_address
> (1645)|returning IP address from /etc/hosts file: 172.20.106.1 2011-07-05
> 11:03:25|9866|30:30|image|utils.pm:is_inblockrequest(6163)|zero rows were
> returned from database select 2011-07-05
> 11:03:25|9866|30:30|image|DataStructure.pm:get_image_affiliation_name(2035
> )|image owner id: 1 2011-07-05
> 11:03:25|9866|30:30|image|utils.pm:getnewdbh(2709)|database requested
> (information_schema) does not match handle stored in $ENV{dbh} (vcl:)
> 2011-07-05 11:03:25|9866|30:30|image|utils.pm:getnewdbh(2760)|database
> handle stored in $ENV{dbh} 2011-07-05
> 11:03:25|9866|30:30|image|DataStructure.pm:retrieve_user_data(1352)|attemp
> ting to retrieve and store data for user: user.id = '1' 2011-07-05
> 11:03:25|9866|30:30|image|utils.pm:getnewdbh(2709)|database requested
> (vcl) does not match handle stored in $ENV{dbh} (information_schema:)
> 2011-07-05 11:03:25|9866|30:30|image|utils.pm:getnewdbh(2760)|database
> handle stored in $ENV{dbh} 2011-07-05
> 11:03:25|9866|30:30|image|DataStructure.pm:retrieve_user_data(1415)|data
> has been retrieved for user: admin (id: 1)
> 
> On 6/24/11 10:13 AM, Josh Thompson wrote:
> > Sunil,
> > 
> > "nodeset <nodename> image" sets up all the xCAT stuff so that the next
> > time the node is booted, it will boot the stateless/statelite image and
> > capture an image of the node.
> > 
> > Can you double check that you have 'os' in the nodetype table set to
> > image for the node you are using?  If you look in the partimageng.pm
> > xCAT module, you see toward the top where it registers the
> > "handled_commands".  The "mk" gets stripped off.  So, that module is
> > registering "install" and "image" for os type = "image".  As long as you
> > have os in the nodetype table set to image, it should be using that
> > module.
> > 
> > You will need to make sure you have all of the required files in
> > locations using 'ppc64' as the arch.
> > 
> > Josh
> > 
> > On Wednesday June 22, 2011, Sunil Venkatesh wrote:
> >> Hi,
> >> 
> >> Update !
> >> 
> >> I was able to fix the problem that I was facing with the scripts by
> >> disabling the firewall. But, I still have a problem with the command-
> >> 
> >> nodeset <nodename> image
> >> 
> >> Unless this error is fixed, I don't think partimage will work. Am I
> >> right here?
> >> 
> >> Thanks,
> >> Sunil
> >> 
> >> On Tue, Jun 21, 2011 at 3:13 PM, Sunil Venkatesh <suni...@umbc.edu> wrote:
> >>> Josh,
> >>> 
> >>> I have reached a point where I am able to boot the ppc using the
> >>> statelite images created using genimage. But, I was wondering how
> >>> significant the following command is.
> >>> 
> >>> nodeset <nodename> image
> >>> 
> >>> I got the same error that Prem had mentioned.
> >>> 
> >>> 
> >>> power01: Error: Unable to identify plugin for this command, check
> >>> relevant tables: nodetype.os
> >>> Error: Some nodes failed to set up image resources, aborting
> >>> 
> >>> I tried changing the 'os' field to 'image' under nodetype, that doesn't
> >>> seem to help. I get the same error even after the change. 'arch' in my
> >>> case is set to 'ppc64'.
> >>> 
> >>> 
> >>> Also, I think partimage plugin needs to be changed to support the ppc
> >>> architecture, from what you had mentioned in the other thread.
> >>> 
> >>> I am not sure what the command 'nodeset <nodename> image' does, but, I
> >>> am able to boot the statelite images by making changes to the yaboot
> >>> configuration files. The ppc blade currently uses LVM, that needs to
> >>> be replaced with ext2/ext3 from what I read from the other thread, am
> >>> I right? Also, just out of curiosity I left the statelite image to
> >>> boot with my current setting. I can see the xcat script throwing an
> >>> error-
> >>> 
> >>> /opt/xcat/xcatdsklspost: line 229: /xcatpost/getpostscript.awk: No such
> >>> file or directory
> >>> /tmp/mypostscript: line 16: updateflag.awk: command not found
> >>> 
> >>> Both getpostscript.awk & updateflag.awk are not found in the rootimg
> >>> created by genimage. Is there any place I could find these scripts?
> >>> 
> >>> Also, please correct me if there is anything wrong with the procedure I
> >>> am following.
> >>> 
> >>> 
> >>> Thanks in advance.
> >>> 
> >>> Regards,
> >>> Sunil
> >>> 
> >>> On 6/13/11 4:13 PM, Josh Thompson wrote:
> >>>> Sunil,
> >>>> 
> >>>> From what I remember, I didn't have to do much to the rootimg.gz
> >>>> image to make it work.  I created the files I supply before xCAT
> >>>> started using "statelite" instead of "stateless".  I think statelite
> >>>> uses NFS to mount the image, and stateless uses an image file
> >>>> downloaded to the node and run out of RAM.  Since generating a
> >>>> statelite image is pretty straightforward use of xCAT, you may want
> >>>> to ask on the xcat-user email list for help with it.
> >>>> 
> >>>> Unless you can have the admins of the other dhcp server on your
> >>>> network exclude the MAC addresses of your blades, you'll need to
> >>>> create a separate private network to control your VCL stuff, either
> >>>> physically or with VLANs.
> >>>> 
> >>>> If they can exclude the MACs, you can set up the dhcp server on your
> >>>> management node to only answer to requests from your blades.
> >>>> 
> >>>> Josh
> >>>> 
> >>>> On Monday June 13, 2011, Sunil Venkatesh wrote:
> >>>>> Josh,
> >>>>> 
> >>>>> Again, Thank you for your valuable inputs. I have got to the point
> >>>>> where I can get the compute node to boot using the stateless images.
> >>>>> I had to manually configure the netboot since we already had a DHCP
> >>>>> server which is not the same as our Management node. Since our setup
> >>>>> is not in an isolated environment, I could not let xcat handle the
> >>>>> dhcp & netboot configuration (it messed up our network
> >>>>> configuration when I let xcat handle it; we had 2 dhcp servers
> >>>>> running at that point). Are you aware of any way to let xcat handle
> >>>>> such scenarios?
> >>>>> 
> >>>>> Although I am able to get the compute node to boot with the kernel
> >>>>> image & initrd, and NFS mount the rootimg that was generated using
> >>>>> 'genimage', I am getting the following error on the compute node's
> >>>>> console -
> >>>>> 
> >>>>>       FATAL error: could not get the entries from litefile table...
> >>>>> 
> >>>>> After going through the init-scripts, I found that the 'xCATCmd' binary is
> >>>>> not present in the rootimg. I am currently checking the xcat
> >>>>> packages for its availability. If you know the procedure to get it
> >>>>> onto the compute node, please let me know the same.
> >>>>> 
> >>>>> Appreciate your support.
> >>>>> 
> >>>>> Thanking you,
> >>>>> Sunil
> >>>>> 
> >>>>> On 6/8/11 9:02 AM, Josh Thompson wrote:
> >>>>>> Sunil,
> >>>>>> 
> >>>>>> I don't recall seeing any documentation on those parts.  I had to
> >>>>>> poke around looking at parts of xCAT to see how it worked.  It's
> >>>>>> been a few years since I did that; so, I don't remember much about
> >>>>>> the process. My recommendation would be to start looking at things
> >>>>>> in the rootimg.gz image.  Looking at it now, I see that
> >>>>>> /opt/xcat/xcatdsklspost gets run when rootimg.gz boots.  It looks
> >>>>>> like it downloads all of the postscripts from the management node
> >>>>>> and then runs getpostscript.awk, which issues a command to xcatd to
> >>>>>> get the primary postscript for that machine.  I've forgotten how
> >>>>>> xcatd then builds the primary postscript. I do remember that in the
> >>>>>> partimageng.pm module, I had it add the partimageng postscript.
> >>>>>> 
> >>>>>> So, you'll really have to start digging through how the xcat
> >>>>>> postscript system works.
> >>>>>> 
> >>>>>> Josh
> >>>>>> 
> >>>>>> On Tuesday June 07, 2011, Sunil Venkatesh wrote:
> >>>>>>> Josh,
> >>>>>>> 
> >>>>>>> Is there any place I could find some details on
> >>>>>>> 
> >>>>>>> "... /Once the compute node is booted with the stateless
> >>>>>>> image, it uses NFS to mount some things from the management node,
> >>>>>>> and then runs some xcat postscripts,/.... "
> >>>>>>> 
> >>>>>>> I have the stateless images ready with partimage compiled for PPC.
> >>>>>>> For the compute node (power 7) to boot using the stateless images,
> >>>>>>> I need to configure the yaboot instead of pxeboot (which is
> >>>>>>> specific to x86). I wanted to know where in the startup files the
> >>>>>>> execution of partimage and NFS mount is configured. Is it
> >>>>>>> configured by the "genimage" command
> >>>>>>> itself? Considering the way in which the nodes are configured in
> >>>>>>> the network, it would not be a good idea to let xcat take care of
> >>>>>>> configuring the details like DHCPD for netboot. So, I need to make
> >>>>>>> changes to the configuration files manually, which is why this
> >>>>>>> query came up.
> >>>>>>> 
> >>>>>>> Thanks in advance.
> >>>>>>> 
> >>>>>>> Regards,
> >>>>>>> Sunil
> >>>>>>> 
> >>>>>>> On 6/1/11 1:39 PM, Josh Thompson wrote:
> >>>>>>>> Sunil,
> >>>>>>>> 
> >>>>>>>> The "stateless" image I refer to is what is actually booted on the
> >>>>>>>> compute node containing the image to be captured.  It's called
> >>>>>>>> stateless because it is loaded completely in RAM and does not
> >>>>>>>> maintain any state when a reboot occurs.
> >>>>>>>> 
> >>>>>>>> The partimage binary is part of this stateless image and actually
> >>>>>>>> runs on the compute node.  It does not run on the management node.
> >>>>>>>> The management node does not have block level access to the disk
> >>>>>>>> on the compute node to be able to capture the image from the
> >>>>>>>> disk.
> >>>>>>>> 
> >>>>>>>> I'll try to describe the process a little better.  The management
> >>>>>>>> node issues a reboot command to the compute node.  The compute
> >>>>>>>> node uses PXE to load and boot a kernel (vmlinuz), initial RAM
> >>>>>>>> disk (initrd.img), and a root filesystem (rootimg.gz) from the
> >>>>>>>> management node.  All three of these together make up the
> >>>>>>>> stateless image.  Once the compute node is booted with the
> >>>>>>>> stateless image, it uses NFS to mount some things from the
> >>>>>>>> management node, and then runs some xcat postscripts, one of
> >>>>>>>> which is the partimageng postscript.  This postscript determines
> >>>>>>>> what partitions are on the compute node and, depending on how the
> >>>>>>>> postscript is configured, uses partimage or partimageng to
> >>>>>>>> capture an image of the compute node disk that is then saved to
> >>>>>>>> the management node.  When it is finished capturing the image, it
> >>>>>>>> notifies xcat on the management node and then reboots.  xcat
> >>>>>>>> reconfigures itself to tell the compute node to boot off of disk
> >>>>>>>> at next boot.  When the compute node comes up, it uses PXE to ask
> >>>>>>>> the management node how to boot.  The management node tells it to
> >>>>>>>> boot off of disk.
> >>>>>>>> 
> >>>>>>>> I hope that clarifies how the system works.  If any of it is
> >>>>>>>> unclear, please ask for further clarification.
> >>>>>>>> 
> >>>>>>>> Josh
> >>>>>>>> 
> >>>>>>>> On Wednesday June 01, 2011, Sunil Venkatesh wrote:
> >>>>>>>>> Josh,
> >>>>>>>>> 
> >>>>>>>>> I had one more clarification.
> >>>>>>>>> 
> >>>>>>>>> partimage binaries run in the management node to capture an
> >>>>>>>>> (stateless) image from the compute node right? In that case, is
> >>>>>>>>> there a need for these binaries to go into the rootimg.gz??
> >>>>>>>>> 
> >>>>>>>>> My assumption is, partimage runs on the management node (an intel
> >>>>>>>>> blade in our case) to capture a stateless image from a compute
> >>>>>>>>> node (a power 7 blade) and stores these images under " /install
> >>>>>>>>> " of the management node. Please correct me if I am wrong here.
> >>>>>>>>> 
> >>>>>>>>> Regards,
> >>>>>>>>> Sunil
> >>>>>>>>> 
> >>>>>>>>> On 6/1/11 9:58 AM, Josh Thompson wrote:
> >>>>>>>>>> -----BEGIN PGP SIGNED MESSAGE-----
> >>>>>>>>>> Hash: SHA1
> >>>>>>>>>> 
> >>>>>>>>>> On Tuesday May 31, 2011, Sunil Venkatesh wrote:
> >>>>>>>>>>> Hi,
> >>>>>>>>>>> 
> >>>>>>>>>>> I used the steps that were mentioned under
> >>>>>>>>>>> 
> >>>>>>>>>>> https://cwiki.apache.org/confluence/display/VCL/Adding+support+for+partimage+and+partimage-ng+to+xCAT+2.x+%28unofficial%29
> >>>>>>>>>>> 
> >>>>>>>>>>> to enable partimage support for xcat. I wasn't sure if I need
> >>>>>>>>>>> to change references to x86 & x86_64 (as directories) to
> >>>>>>>>>>> reflect the
> >>>>>>>>>>> ppc architecture, as the web page says "The architecture for
> >>>>>>>>>>> the node must always be set to x86 for this..". I have with me
> >>>>>>>>>>> the vmlinuz (kernel image) and initrd for the capture process.
> >>>>>>>>>>> The 2 nodeset commands
> >>>>>>>>>> 
> >>>>>>>>>> By this, do you mean you have vmlinuz and initrd for your power
> >>>>>>>>>> blades, not the ones linked to off of the page you listed above?
> >>>>>>>>>> If you do, that's a good start.  However, you'll also need
> >>>>>>>>>> rootimg.gz. rootimg.gz is the root filesystem for the stateless
> >>>>>>>>>> image.  It also contains the partimage and partimageng binaries.
> >>>>>>>>>> Assuming partimage or partimageng can actually capture
> >>>>>>>>>> partitions from power systems, you'll need to compile at least
> >>>>>>>>>> one of them to run on power.  For the rootimg.gz image I
> >>>>>>>>>> provided, I compiled them statically so that I didn't have to
> >>>>>>>>>> worry about including any library dependencies in rootimg.gz.
> >>>>>>>>>> 
> >>>>>>>>>> It would be a good idea to research how to use xcat's genimage
> >>>>>>>>>> command to generate stateless images to learn how to do this.
> >>>>>>>>>> 
> >>>>>>>>>> If there's any part of the above that you don't fully
> >>>>>>>>>> understand, please ask me to clarify it.  Until you have a
> >>>>>>>>>> stateless image that you can deploy to your power blades,
> >>>>>>>>>> there's no point in trying to debug any VCL specific items.
> >>>>>>>>>> 
> >>>>>>>>>> Josh
> >>>>>>>>>> - --
> >>>>>>>>>> - -------------------------------
> >>>>>>>>>> Josh Thompson
> >>>>>>>>>> VCL Developer
> >>>>>>>>>> North Carolina State University
> >>>>>>>>>> 
> >>>>>>>>>> my GPG/PGP key can be found at pgp.mit.edu
> >>>>>>>>>> -----BEGIN PGP SIGNATURE-----
> >>>>>>>>>> Version: GnuPG v2.0.17 (GNU/Linux)
> >>>>>>>>>> 
> >>>>>>>>>> iEYEARECAAYFAk3mRYsACgkQV/LQcNdtPQNnVgCbB9ZFJn0+C45RC/g75RqGZY/j
> >>>>>>>>>> PZYAniP2Eam7nxgiDWUnp5sKPYPO4OMa
> >>>>>>>>>> =exBV
> >>>>>>>>>> -----END PGP SIGNATURE-----
- -- 
- -------------------------------
Josh Thompson
VCL Developer
North Carolina State University

my GPG/PGP key can be found at pgp.mit.edu
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.17 (GNU/Linux)

iEYEARECAAYFAk4VzQoACgkQV/LQcNdtPQM8YQCePg3O5vp5AXEhiO+5aIRIUO/S
6IgAn1Xt4ytGnmxpfJVteCScFi0dRz15
=Yls1
-----END PGP SIGNATURE-----
