Josh,

Could you send me links to the VCL workshop resources? If there is a way to follow the workshop while it is in progress, that would help too.

Regards,
Sunil

On Jul 7, 2011, at 2:03 PM, Josh Thompson wrote:

> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> Sunil,
>
> On Thursday July 07, 2011, Sunil Venkatesh wrote:
>> Thanks Josh. My professor was asking about the details of the VCL workshop
>> in NC. Are you aware of these details?
>
> The workshop is hosted by NCSU. It takes people from an introduction to VCL
> to actually installing and managing it. It is already full, but I think
> recordings of the sessions may be available when it is over.
>
>>
>> Please bear with my comments inline.
>
> Responses also inline.
>
>> On 7/7/11 11:13 AM, Josh Thompson wrote:
>>> -----BEGIN PGP SIGNED MESSAGE-----
>>> Hash: SHA1
>>>
>>> On Tuesday July 05, 2011, Sunil Venkatesh wrote:
>>>> Hi Josh,
>>>>
>>>> I was able to get the following things done toward getting VCL to
>>>> work on POWER.
>>>>
>>>> 1. Modified the xCAT tables, particularly noderes and bootparams, to get
>>>> the capture process working with statelite images instead of stateless
>>>> images.
>>>>
>>>> 2. Used partimage to capture the images (did NOT set usepartimageng to
>>>> 1).
>>>>
>>>> -rw-r--r-- 1 root root 0 Jul 5 16:38 compute.img.capturedone
>>>> -rw-r--r-- 1 root root 0 Jul 5 15:58 compute.img.capturefailed
>>>> -rw------- 1 root root 6.5M Jul 5 16:07 compute-parta2.gz
>>>> -rw------- 1 root root 679M Jul 5 16:10 compute-parta3.gz
>>>> -rw------- 1 root root 23M Jul 5 16:38 compute-parta6.gz
>>>> -rw-r--r-- 1 root root 512 Jul 5 16:07 compute-sda.mbr
>>>> -rw-r--r-- 1 root root 363 Jul 5 16:07 compute-sda.sfdisk
>>>>
>>>>
>>>> 2 partitions, including the boot partition present on the blade, were
>>>> captured under /install/image/ppc64/. Initially, RHEL 5 was installed on
>>>> a 600 GB partition, which caused the capture process to fail. The image of
>>>> the partition was generated once the partition size was reduced to 6 GB.
>>>> Is it necessary for me to use partimage-ng instead of partimage itself?
>>>
>>> Are you asking if you need to use partimage-ng for partitions that are
>>> 600GB? If so, I don't really know. We've never dealt with partitions
>>> that large.
>>
>> Here, I am just asking whether images captured using partimage are
>> recognized by VCL, or whether I am required to use partimage-ng. From your
>> earlier emails to Prem, I gathered that the only difference between
>> partimage and partimage-ng (after setting usepartimageng to 1) is that the
>> former generates images with .gz and the latter generates .img. Am I
>> right here? Also, I was able to get the 600GB partition captured; since
>> the partition was empty, it resulted in a ~17MB image file.
>
> VCL can deploy images captured with both partimage and partimage-ng. At
> NCSU, we were going to switch to partimage-ng, which is why I added in
> support for it, but then we realized we'd have to upgrade all of our
> management nodes to xCAT2 at the same time or some of them wouldn't be able
> to deploy newly captured images that were captured with partimage-ng (the
> support for xCAT1.x can't deploy using partimage-ng). So, we just stuck
> with partimage. The captured file format between the two is different.
>
>>>> When proceeding further with "vcld --setup", the script was not able to
>>>> find the images that were created using partimage. The options that are
>>>> provided in the script do not allow for selecting an architecture
>>>> other than x86/x86_64.
>>>
>>> You'll need to modify the vcld image.pm module. Look in
>>> /usr/local/vcl/lib/VCL. In image.pm, look for the function
>>> 'setup_capture_base_image'; then, find 'my @architecture_choices' and add
>>> 'ppc' as another option.
>>
>> As a matter of fact, I tried this step. But the
>> _get_image_repository_path function in
>> /usr/local/vcl/lib/VCL/Module/Provisioning/xCAT.pm does not recognize
>> the architecture when I choose ppc/ppc64 in the menu. On line 2922 in
>> the same file, image_architecture is set to undefined.
>> I think the list of supported architectures is stored in some MySQL table.
>> I haven't checked this yet; I was trying to get VCL to recognize the images
>> as x86/x86_64 by setting up soft links in the search paths of VCL.
>
> This and your next question are both deeper into the backend code than I've
> worked with. Andy or Aaron may be able to answer your questions further.
>
> Josh
>
>>>> Also, in the error log vcld is looking for
>>>>
>>>> /opt/xcat/share/xcat/install/image/rh5image-power010701bi34-v0.tmpl
>>>>
>>>> and cannot find the template file. Should the template file that needs
>>>> to be accessed in this case be createimage.ppc64.tmpl?
>>>
>>> This is actually a check to make sure the image doesn't already exist
>>> before trying to capture it. So, it is good that it doesn't find it.
>>
>> If possible, could you please describe the steps that take place here?
>> If there is any documentation available on this, that would work too.
>> You said "image doesn't already exist before trying to capture it". How
>> does VCL capture the images? Does it make use of the images that are
>> already generated using partimage? If so, where does it look for the
>> images?
>>
>> Sorry for asking so many questions. I could trace the scripts to check
>> the flow, but that would take a lot of time. You have been really
>> patient with all my queries; I appreciate that.
>>
>> Thanks
>> Sunil
>>
>>> It sounds like you're almost there. Great work!
>>>
>>> Josh
>>>
>>>> I have attached a log at the end of the mail. I am not sure where I have
>>>> gone wrong with the VCL configuration.
>>>>
>>>> -Sunil
>>>>
>>>> -----
>>>>
>>>> rh5image-power010701bi34-v0 image creation failed
>>>> ------------------------------------------------------------------------
>>>> time: 2011-07-05 11:03:25
>>>> caller: image.pm:reservation_failed(385)
>>>> ( 0) image.pm, reservation_failed (line: 385)
>>>> (-1) image.pm, process (line: 167)
>>>> (-2) vcld, make_new_child (line: 568)
>>>> (-3) vcld, main (line: 346)
>>>> ------------------------------------------------------------------------
>>>> management node: web1.bluegrit.cs.umbc.edu
>>>> reservation PID: 9866
>>>> parent vcld PID: 19110
>>>>
>>>> request ID: 30
>>>> reservation ID: 30
>>>> request state/laststate: image/image
>>>> request start time: 2011-07-05 11:03:20
>>>> request end time: 2011-07-05 12:03:20
>>>> for imaging: no
>>>> log ID: none
>>>>
>>>> computer: power01.bluegrit.cs.umbc.edu
>>>> computer id: 2
>>>> computer type: blade
>>>> computer eth0 MAC address: <undefined>
>>>> computer eth1 MAC address: <undefined>
>>>> computer private IP address: 172.20.106.1
>>>> computer public IP address: 172.20.106.1
>>>> computer in block allocation: no
>>>> provisioning module: VCL::Module::Provisioning::xCAT2
>>>>
>>>> image: rh5image-power010701bi34-v0
>>>> image display name: power010701bi
>>>> image ID: 34
>>>> image revision ID: 34
>>>> image size: 1450 MB
>>>> use Sysprep: yes
>>>> root access: yes
>>>> image owner ID: 1
>>>> image owner affiliation: Local
>>>> image revision date created: 2011-07-05 11:03:25
>>>> image revision production: yes
>>>> OS module: VCL::Module::OS::Linux
>>>>
>>>> user: admin
>>>> user name: vcl admin
>>>> user ID: 1
>>>> user affiliation: Local
>>>> ------------------------------------------------------------------------
>>>> RECENT LOG ENTRIES FOR THIS PROCESS:
>>>> 2011-07-05 11:03:25|9866|30:30|image|Module.pm:create_os_object(304)|VCL::Module::OS::Linux OS object created for rh5image-power010701bi34-v0, address: 88fb070
>>>> 2011-07-05 11:03:25|9866|30:30|image|xCAT.pm:initialize(110)|XCATROOT environment variable is not set, using /opt/xcat
>>>> 2011-07-05 11:03:25|9866|30:30|image|xCAT.pm:initialize(128)|xCAT root path found: /opt/xcat
>>>> 2011-07-05 11:03:25|9866|30:30|image|xCAT.pm:initialize(130)|xCAT module initialized
>>>> 2011-07-05 11:03:25|9866|30:30|image|xCAT2.pm:initialize(110)|XCATROOT environment variable is not set, using /opt/xcat
>>>> 2011-07-05 11:03:25|9866|30:30|image|xCAT2.pm:initialize(128)|xCAT root path found: /opt/xcat
>>>> 2011-07-05 11:03:25|9866|30:30|image|xCAT2.pm:initialize(130)|xCAT module initialized
>>>> 2011-07-05 11:03:25|9866|30:30|image|Module.pm:create_provisioning_object(420)|VCL::Module::Provisioning::xCAT2 module loaded
>>>> 2011-07-05 11:03:25|9866|30:30|image|Module.pm:create_mn_os_object(335)|management node OS object has already been created, address: 88f23b0, returning 1
>>>> 2011-07-05 11:03:25|9866|30:30|image|Module.pm:new(200)|VCL::Module::Provisioning::xCAT2 object created for computer power01, address: 88fb0e0
>>>> 2011-07-05 11:03:25|9866|30:30|image|xCAT2.pm:initialize(110)|XCATROOT environment variable is not set, using /opt/xcat
>>>> 2011-07-05 11:03:25|9866|30:30|image|xCAT2.pm:initialize(128)|xCAT root path found: /opt/xcat
>>>> 2011-07-05 11:03:25|9866|30:30|image|xCAT2.pm:initialize(130)|xCAT module initialized
>>>> 2011-07-05 11:03:25|9866|30:30|image|Module.pm:create_provisioning_object(426)|VCL::Module::Provisioning::xCAT2 provisioner object created for power01, address: 88fb0e0
>>>> 2011-07-05 11:03:25|9866|30:30|image|State.pm:initialize(126)|returning 1
>>>> 2011-07-05 11:03:25|9866|30:30|image|vcld:make_new_child(565)|VCL::image object created and initialized
>>>> 2011-07-05 11:03:25|9866|30:30|image|utils.pm:mail(1268)|SUCCESS -- Sending mail To:shru...@gmail.com, VCL IMAGE Creation Started: rh5image-power010701bi34-v0
>>>> 2011-07-05 11:03:25|9866|30:30|image|xCAT.pm:does_image_exist(2434)|image OS install type: partimage
>>>> 2011-07-05 11:03:25|9866|30:30|image|xCAT.pm:_get_image_repository_path(2910)|management node identifier argument was not specified
>>>>
>>>> 2011-07-05 11:03:25|9866|30:30|image|xCAT.pm:_get_image_repository_path(2932)|attempting to determine repository path for image on web1.bluegrit.cs.umbc.edu:
>>>> |9866|30:30|image| image id: 34
>>>> |9866|30:30|image| OS name: rh5image
>>>> |9866|30:30|image| OS type: linux
>>>> |9866|30:30|image| OS install type: partimage
>>>> |9866|30:30|image| OS source path: image
>>>> |9866|30:30|image| architecture: x86_64
>>>>
>>>> 2011-07-05 11:03:25|9866|30:30|image|xCAT.pm:_get_image_repository_path(2996)|did not find any images under /tftpboot/xcat//linux_image/x86_64 on web1.bluegrit.cs.umbc.edu
>>>> 2011-07-05 11:03:25|9866|30:30|image|xCAT.pm:_get_image_repository_path(3006)|returning repository path for web1.bluegrit.cs.umbc.edu: /tftpboot/xcat//image/x86_64
>>>> 2011-07-05 11:03:25|9866|30:30|image|xCAT.pm:does_image_exist(2444)|image repository path: /tftpboot/xcat//image/x86_64
>>>>
>>>> 2011-07-05 11:03:25|9866|30:30|image|utils.pm:run_command(9010)|executed command: du -c /tftpboot/xcat//image/x86_64/*rh5image-power010701bi34-v0* 2>&1 | grep total 2>&1, pid: 9877, exit status: 0, output:
>>>> |9866|30:30|image| 0 total
>>>>
>>>> 2011-07-05 11:03:25|9866|30:30|image|xCAT.pm:does_image_exist(2506)|image does NOT exist: rh5image-power010701bi34-v0
>>>> 2011-07-05 11:03:25|9866|30:30|image|xCAT2.pm:_get_image_template_path(2084)|management node identifier argument was not specified
>>>>
>>>> 2011-07-05 11:03:25|9866|30:30|image|xCAT2.pm:_get_image_template_path(2115)|attempting to determine template path for image:
>>>> |9866|30:30|image| image name: rh5image-power010701bi34-v0
>>>> |9866|30:30|image| OS install type: partimage
>>>> |9866|30:30|image| OS source path: image
>>>> |9866|30:30|image| xCAT 2.x OS source path: image
>>>>
>>>> 2011-07-05 11:03:25|9866|30:30|image|xCAT2.pm:_get_image_template_path(2123)|returning: /opt/xcat/share/xcat/install/image
>>>> 2011-07-05 11:03:25|9866|30:30|image|xCAT.pm:does_image_exist(2518)|template repository path for rh5image-power010701bi34-v0: /opt/xcat/share/xcat/install/image
>>>> 2011-07-05 11:03:25|9866|30:30|image|xCAT.pm:does_image_exist(2530)|template file does not exist: /opt/xcat/share/xcat/install/image/rh5image-power010701bi34-v0.tmpl
>>>> 2011-07-05 11:03:25|9866|30:30|image|xCAT.pm:does_image_exist(2570)|image rh5image-power010701bi34-v0 does NOT exist on this management node
>>>> 2011-07-05 11:03:25|9866|30:30|image|image.pm:process(145)|image rh5image-power010701bi34-v0 does not exist in the repository
>>>> 2011-07-05 11:03:25|9866|30:30|image|DataStructure.pm:_automethod(834)|data structure updated: $self->request_data->{reservation}{30}{image}{lastupdate}
>>>> |9866|30:30|image| image_lastupdate = 2011-07-05 11:03:25
>>>>
>>>> 2011-07-05 11:03:25|9866|30:30|image|DataStructure.pm:_automethod(834)|data structure updated: $self->request_data->{reservation}{30}{imagerevision}{datecreated}
>>>> |9866|30:30|image| imagerevision_date_created = 2011-07-05 11:03:25
>>>>
>>>> 2011-07-05 11:03:25|9866|30:30|image|image.pm:process(161)|calling provisioning module's capture() subroutine
>>>> 2011-07-05 11:03:25|9866|30:30|image|xCAT2.pm:capture(776)|image=rh5image-power010701bi34-v0, computer=power01
>>>>
>>>> 2011-07-05 11:03:25|9866|30:30|image|utils.pm:run_ssh_command(5380)|executing SSH command on power01:
>>>> |9866|30:30|image| /usr/bin/ssh -i /etc/vcl/vcl.key -o StrictHostKeyChecking=no -l root -p 22 -x power01 'chown root currentimage.txt; chmod 777 currentimage.txt' 2>&1
>>>>
>>>> 2011-07-05 11:03:25|9866|30:30|image|utils.pm:run_ssh_command(5464)|run_ssh_command output:
>>>> |9866|30:30|image| Permission denied, please try again.
>>>> |9866|30:30|image| Permission denied, please try again.
>>>> |9866|30:30|image| Permission denied (publickey,gssapi-with-mic,password).
>>>>
>>>> 2011-07-05 11:03:25|9866|30:30|image|utils.pm:run_ssh_command(5474)|SSH command executed on power01, command:
>>>> |9866|30:30|image| /usr/bin/ssh -i /etc/vcl/vcl.key -o StrictHostKeyChecking=no -l root -p 22 -x power01 'chown root currentimage.txt; chmod 777 currentimage.txt' 2>&1
>>>> |9866|30:30|image| returning (255, "Permission denied, please try ...")
>>>>
>>>> 2011-07-05 11:03:25|9866|30:30|image|utils.pm:write_currentimage_txt(5685)|updated ownership and permissions on currentimage.txt
>>>>
>>>> 2011-07-05 11:03:25|9866|30:30|image|utils.pm:run_ssh_command(5380)|executing SSH command on power01:
>>>> |9866|30:30|image| /usr/bin/ssh -i /etc/vcl/vcl.key -o StrictHostKeyChecking=no -l root -p 22 -x power01 'echo -e "rh5image-power010701bi34-v0\r\nid=34\r\nprettyname=power010701bi\r\nimagerevision_id=34\r\nimagerevision_datecreated=2011-07-05 11:03:25\r\ncomputer_id=2\r\ncomputer_hostname=power01.bluegrit.cs.umbc.edu" > currentimage.txt && cat currentimage.txt' 2>&1
>>>>
>>>> 2011-07-05 11:03:25|9866|30:30|image|utils.pm:run_ssh_command(5464)|run_ssh_command output:
>>>> |9866|30:30|image| Permission denied, please try again.
>>>> |9866|30:30|image| Permission denied, please try again.
>>>> |9866|30:30|image| Permission denied (publickey,gssapi-with-mic,password).
>>>>
>>>> 2011-07-05 11:03:25|9866|30:30|image|utils.pm:run_ssh_command(5474)|SSH command executed on power01, command:
>>>> |9866|30:30|image| /usr/bin/ssh -i /etc/vcl/vcl.key -o StrictHostKeyChecking=no -l root -p 22 -x power01 'echo -e "rh5image-power010701bi34-v0\r\nid=34\r\nprettyname=power010701bi\r\nimagerevision_id=34\r\nimagerevision_datecreated=2011-07-05 11:03:25\r\ncomputer_id=2\r\ncomputer_hostname=power01.bluegrit.cs.umbc.edu" > currentimage.txt && cat currentimage.txt' 2>&1
>>>> |9866|30:30|image| returning (255, "Permission denied, please try ...")
>>>> |9866|30:30|image| ---- WARNING ----
>>>> |9866|30:30|image| 2011-07-05 11:03:25|9866|30:30|image|utils.pm:write_currentimage_txt(5699)|failed to create currentimage.txt file on power01:
>>>> |9866|30:30|image| Permission denied, please try again.
>>>> |9866|30:30|image| Permission denied, please try again.
>>>> |9866|30:30|image| Permission denied (publickey,gssapi-with-mic,password).
>>>> |9866|30:30|image| ( 0) utils.pm, write_currentimage_txt (line: 5699)
>>>> |9866|30:30|image| (-1) xCAT2.pm, capture (line: 779)
>>>> |9866|30:30|image| (-2) image.pm, process (line: 162)
>>>> |9866|30:30|image| (-3) vcld, make_new_child (line: 568)
>>>> |9866|30:30|image| (-4) vcld, main (line: 346)
>>>> |9866|30:30|image| ---- WARNING ----
>>>> |9866|30:30|image| 2011-07-05 11:03:25|9866|30:30|image|xCAT2.pm:capture(783)|unable to update currentimage.txt on power01
>>>> |9866|30:30|image| ( 0) xCAT2.pm, capture (line: 783)
>>>> |9866|30:30|image| (-1) image.pm, process (line: 162)
>>>> |9866|30:30|image| (-2) vcld, make_new_child (line: 568)
>>>> |9866|30:30|image| (-3) vcld, main (line: 346)
>>>> |9866|30:30|image| ---- WARNING ----
>>>> |9866|30:30|image| 2011-07-05 11:03:25|9866|30:30|image|image.pm:process(166)|rh5image-power010701bi34-v0 image failed to be captured by provisioning module
>>>> |9866|30:30|image| ( 0) image.pm, process (line: 166)
>>>> |9866|30:30|image| (-1) vcld, make_new_child (line: 568)
>>>> |9866|30:30|image| (-2) vcld, main (line: 346)
>>>>
>>>> 2011-07-05 11:03:25|9866|30:30|image|DataStructure.pm:get_computer_private_ip_address(1581)|attempting to retrieve private IP address for computer: power01
>>>> 2011-07-05 11:03:25|9866|30:30|image|DataStructure.pm:get_computer_private_ip_address(1585)|retrieved contents of /etc/hosts on this management node, contains 158 lines
>>>> 2011-07-05 11:03:25|9866|30:30|image|DataStructure.pm:get_computer_private_ip_address(1645)|returning IP address from /etc/hosts file: 172.20.106.1
>>>> 2011-07-05 11:03:25|9866|30:30|image|utils.pm:is_inblockrequest(6163)|zero rows were returned from database select
>>>> 2011-07-05 11:03:25|9866|30:30|image|DataStructure.pm:get_image_affiliation_name(2035)|image owner id: 1
>>>> 2011-07-05 11:03:25|9866|30:30|image|utils.pm:getnewdbh(2709)|database requested (information_schema) does not match handle stored in $ENV{dbh} (vcl:)
>>>> 2011-07-05 11:03:25|9866|30:30|image|utils.pm:getnewdbh(2760)|database handle stored in $ENV{dbh}
>>>> 2011-07-05 11:03:25|9866|30:30|image|DataStructure.pm:retrieve_user_data(1352)|attempting to retrieve and store data for user: user.id = '1'
>>>> 2011-07-05 11:03:25|9866|30:30|image|utils.pm:getnewdbh(2709)|database requested (vcl) does not match handle stored in $ENV{dbh} (information_schema:)
>>>> 2011-07-05 11:03:25|9866|30:30|image|utils.pm:getnewdbh(2760)|database handle stored in $ENV{dbh}
>>>> 2011-07-05 11:03:25|9866|30:30|image|DataStructure.pm:retrieve_user_data(1415)|data has been retrieved for user: admin (id: 1)
>>>>
>>>> On 6/24/11 10:13 AM, Josh Thompson wrote:
>>>>> Sunil,
>>>>>
>>>>> "nodeset <nodename> image" sets up all the xCAT stuff so that the next
>>>>> time the node is booted, it will boot the stateless/statelite image and
>>>>> capture an image of the node.
>>>>>
>>>>> Can you double check that you have 'os' in the nodetype table set to
>>>>> image for the node you are using? If you look in the partimageng.pm
>>>>> xCAT module, you see toward the top where it registers the
>>>>> "handled_commands". The "mk" gets stripped off. So, that module is
>>>>> registering "install" and "image" for os type = "image". As long as
>>>>> you have os in the nodetype table set to image, it should be using
>>>>> that module.
>>>>>
>>>>> You will need to make sure you have all of the required files in
>>>>> locations using 'ppc64' as the arch.
>>>>>
>>>>> Josh
>>>>>
>>>>> On Wednesday June 22, 2011, Sunil Venkatesh wrote:
>>>>>> Hi,
>>>>>>
>>>>>> Update!
>>>>>>
>>>>>> I was able to fix the problem that I was facing with the scripts by
>>>>>> disabling the firewall. But, I still have a problem with the command:
>>>>>>
>>>>>> nodeset <nodename> image
>>>>>>
>>>>>> Unless this error is fixed, I don't think partimage will work. Am I
>>>>>> right here?
>>>>>>
>>>>>> Thanks,
>>>>>> Sunil
>>>>>>
>>>>>> On Tue, Jun 21, 2011 at 3:13 PM, Sunil Venkatesh <suni...@umbc.edu> wrote:
>>>>>>> Josh,
>>>>>>>
>>>>>>> I have reached a point where I am able to boot the ppc using the
>>>>>>> statelite images created using genimage. But I was wondering how
>>>>>>> significant the following command is.
>>>>>>>
>>>>>>> nodeset <nodename> image
>>>>>>>
>>>>>>> I got the same error that Prem had mentioned.
>>>>>>>
>>>>>>> power01: Error: Unable to identify plugin for this command, check relevant tables: nodetype.os
>>>>>>> Error: Some nodes failed to set up image resources, aborting
>>>>>>>
>>>>>>> I tried changing the 'os' field to 'image' under nodetype, but that
>>>>>>> doesn't seem to help. I get the same error even after the change.
>>>>>>> 'arch' in my case is set to 'ppc64'.
>>>>>>>
>>>>>>> Also, from what you mentioned in the other thread, I think the
>>>>>>> partimage plugin needs to be changed to support the ppc architecture.
>>>>>>>
>>>>>>> I am not sure what the command 'nodeset <nodename> image' does, but
>>>>>>> I am able to boot the statelite images by making changes to the
>>>>>>> yaboot configuration files. The ppc blade currently uses LVM, which
>>>>>>> needs to be replaced with ext2/ext3 from what I read in the other
>>>>>>> thread; am I right? Also, just out of curiosity, I left the statelite
>>>>>>> image to boot with my current settings. I can see the xcat script
>>>>>>> throwing an error:
>>>>>>>
>>>>>>> /opt/xcat/xcatdsklspost: line 229: /xcatpost/getpostscript.awk: No such file or directory
>>>>>>> /tmp/mypostscript: line 16: updateflag.awk: command not found
>>>>>>>
>>>>>>> Neither getpostscript.awk nor updateflag.awk is found in the rootimg
>>>>>>> created by genimage. Is there any place I could find these scripts?
>>>>>>>
>>>>>>> Also, please correct me if there is anything wrong with the procedure
>>>>>>> I am following.
>>>>>>>
>>>>>>> Thanks in advance.
>>>>>>>
>>>>>>> Regards,
>>>>>>> Sunil
>>>>>>>
>>>>>>> On 6/13/11 4:13 PM, Josh Thompson wrote:
>>>>>>>> Sunil,
>>>>>>>>
>>>>>>>> From what I remember, I didn't have to do much to the rootimg.gz
>>>>>>>> image to make it work. I created the files I supply before xCAT
>>>>>>>> started using "statelite" instead of "stateless". I think statelite
>>>>>>>> uses NFS to mount the image, and stateless uses an image file
>>>>>>>> downloaded to the node and run out of RAM. Since generating a
>>>>>>>> statelite image is pretty straightforward use of xCAT, you may want
>>>>>>>> to ask on the xcat-user email list for help with it.
>>>>>>>>
>>>>>>>> Unless you can have the admins of the other dhcp server on your
>>>>>>>> network exclude the MAC addresses of your blades, you'll need to
>>>>>>>> create a separate private network to control your VCL stuff, either
>>>>>>>> physically or with VLANs.
>>>>>>>>
>>>>>>>> If they can exclude the MACs, you can set up the dhcp server on your
>>>>>>>> management node to only answer to requests from your blades.
>>>>>>>>
>>>>>>>> Josh
>>>>>>>>
>>>>>>>> On Monday June 13, 2011, Sunil Venkatesh wrote:
>>>>>>>>> Josh,
>>>>>>>>>
>>>>>>>>> Again, thank you for your valuable inputs. I have got to the point
>>>>>>>>> where I can get the compute node to boot using the stateless
>>>>>>>>> images. I had to manually configure the netboot since we already
>>>>>>>>> had a DHCP server which is not the same as our management node.
>>>>>>>>> Since our setup is not in an isolated environment, I could not let
>>>>>>>>> xcat handle the dhcp and netboot configuration (it messed up our
>>>>>>>>> network configuration when I let xcat handle it; we had 2 dhcp
>>>>>>>>> servers running at that point). Are you aware of any way to let
>>>>>>>>> xcat handle such scenarios?
>>>>>>>>>
>>>>>>>>> Although I am able to get the compute node to boot with the kernel
>>>>>>>>> image and initrd, and NFS mount the rootimg that was generated
>>>>>>>>> using 'genimage', I am getting the following error on the compute
>>>>>>>>> node's console:
>>>>>>>>>
>>>>>>>>> FATAL error: could not get the entries from litefile table...
>>>>>>>>>
>>>>>>>>> After going through the init scripts, I found that the 'xCATCmd'
>>>>>>>>> binary is not present in the rootimg. I am currently checking the
>>>>>>>>> xcat packages for its availability. If you know the procedure to
>>>>>>>>> get it onto the compute node, please let me know.
>>>>>>>>>
>>>>>>>>> Appreciate your support.
>>>>>>>>>
>>>>>>>>> Thanking you,
>>>>>>>>> Sunil
>>>>>>>>>
>>>>>>>>> On 6/8/11 9:02 AM, Josh Thompson wrote:
>>>>>>>>>> Sunil,
>>>>>>>>>>
>>>>>>>>>> I don't recall seeing any documentation on those parts. I had to
>>>>>>>>>> poke around looking at parts of xCAT to see how it worked. It's
>>>>>>>>>> been a few years since I did that; so, I don't remember much about
>>>>>>>>>> the process. My recommendation would be to start looking at things
>>>>>>>>>> in the rootimg.gz image. Looking at it now, I see that
>>>>>>>>>> /opt/xcat/xcatdsklspost gets run when rootimg.gz boots. It looks
>>>>>>>>>> like it downloads all of the postscripts from the management node
>>>>>>>>>> and then runs getpostscript.awk, which issues a command to xcatd to
>>>>>>>>>> get the primary postscript for that machine. I've forgotten how
>>>>>>>>>> xcatd then builds the primary postscript. I do remember that in
>>>>>>>>>> the partimageng.pm module, I had it add the partimageng
>>>>>>>>>> postscript.
>>>>>>>>>>
>>>>>>>>>> So, you'll really have to start digging through how the xcat
>>>>>>>>>> postscript system works.
>>>>>>>>>>
>>>>>>>>>> Josh
>>>>>>>>>>
>>>>>>>>>> On Tuesday June 07, 2011, Sunil Venkatesh wrote:
>>>>>>>>>>> Josh,
>>>>>>>>>>>
>>>>>>>>>>> Is there any place I could find some details on
>>>>>>>>>>>
>>>>>>>>>>> "... Once the compute node is booted with the stateless
>>>>>>>>>>> image, it uses NFS to mount some things from the management node,
>>>>>>>>>>> and then runs some xcat postscripts, ..."
>>>>>>>>>>>
>>>>>>>>>>> I have the stateless images ready with partimage compiled for
>>>>>>>>>>> PPC. For the compute node (power 7) to boot using the stateless
>>>>>>>>>>> images, I need to configure yaboot instead of pxeboot (which is
>>>>>>>>>>> specific to x86). I wanted to know where in the startup files the
>>>>>>>>>>> execution of partimage and the NFS mount is configured. Is it
>>>>>>>>>>> configured by the "genimage" command itself? Considering the way
>>>>>>>>>>> in which the nodes are configured in the network, it would not be
>>>>>>>>>>> a good idea to let xcat take care of configuring details like
>>>>>>>>>>> DHCPD for netboot. So, I need to make changes to the configuration
>>>>>>>>>>> files manually, which is why this query came up.
>>>>>>>>>>>
>>>>>>>>>>> Thanks in advance.
>>>>>>>>>>>
>>>>>>>>>>> Regards,
>>>>>>>>>>> Sunil
>>>>>>>>>>>
>>>>>>>>>>> On 6/1/11 1:39 PM, Josh Thompson wrote:
>>>>>>>>>>>> Sunil,
>>>>>>>>>>>>
>>>>>>>>>>>> The "stateless" image I refer to is what is actually booted on
>>>>>>>>>>>> the compute node containing the image to be captured. It's
>>>>>>>>>>>> called stateless because it is loaded completely in RAM and
>>>>>>>>>>>> does not maintain any state when a reboot occurs.
>>>>>>>>>>>>
>>>>>>>>>>>> The partimage binary is part of this stateless image and
>>>>>>>>>>>> actually runs on the compute node. It does not run on the
>>>>>>>>>>>> management node.
>>>>>>>>>>>> The management node does not have block level
>>>>>>>>>>>> access to the disk on the compute node to be able to capture
>>>>>>>>>>>> the image from the disk.
>>>>>>>>>>>>
>>>>>>>>>>>> I'll try to describe the process a little better. The
>>>>>>>>>>>> management node issues a reboot command to the compute node.
>>>>>>>>>>>> The compute node uses PXE to load and boot a kernel (vmlinuz),
>>>>>>>>>>>> initial RAM disk (initrd.img), and a root filesystem
>>>>>>>>>>>> (rootimg.gz) from the management node. All three of these
>>>>>>>>>>>> together make up the stateless image. Once the compute node is
>>>>>>>>>>>> booted with the stateless image, it uses NFS to mount some
>>>>>>>>>>>> things from the management node, and then runs some xcat
>>>>>>>>>>>> postscripts, one of which is the partimageng postscript. This
>>>>>>>>>>>> postscript determines what partitions are on the compute node
>>>>>>>>>>>> and, depending on how the postscript is configured, uses
>>>>>>>>>>>> partimage or partimageng to capture an image of the compute
>>>>>>>>>>>> node disk that is then saved to the management node. When it is
>>>>>>>>>>>> finished capturing the image, it notifies xcat on the management
>>>>>>>>>>>> node and then reboots. xcat reconfigures itself to tell the
>>>>>>>>>>>> compute node to boot off of disk at next boot. When the compute
>>>>>>>>>>>> node comes up, it uses PXE to ask the management node how to
>>>>>>>>>>>> boot. The management node tells it to boot off of disk.
>>>>>>>>>>>>
>>>>>>>>>>>> I hope that clarifies how the system works. If any of it is
>>>>>>>>>>>> unclear, please ask for further clarification.
>>>>>>>>>>>>
>>>>>>>>>>>> Josh
>>>>>>>>>>>>
>>>>>>>>>>>> On Wednesday June 01, 2011, Sunil Venkatesh wrote:
>>>>>>>>>>>>> Josh,
>>>>>>>>>>>>>
>>>>>>>>>>>>> I had one more clarification.
>>>>>>>>>>>>>
>>>>>>>>>>>>> partimage binaries run on the management node to capture a
>>>>>>>>>>>>> (stateless) image from the compute node, right? In that case, is
>>>>>>>>>>>>> there a need for these binaries to go into the rootimg.gz?
>>>>>>>>>>>>>
>>>>>>>>>>>>> My assumption is that partimage runs on the management node (an
>>>>>>>>>>>>> Intel blade in our case) to capture a stateless image from a
>>>>>>>>>>>>> compute node (a power 7 blade) and stores these images under
>>>>>>>>>>>>> "/install" on the management node. Please correct me if I am
>>>>>>>>>>>>> wrong here.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Regards,
>>>>>>>>>>>>> Sunil
>>>>>>>>>>>>>
>>>>>>>>>>>>> On 6/1/11 9:58 AM, Josh Thompson wrote:
>>>>>>>>>>>>>> -----BEGIN PGP SIGNED MESSAGE-----
>>>>>>>>>>>>>> Hash: SHA1
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Tuesday May 31, 2011, Sunil Venkatesh wrote:
>>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I used the steps that were mentioned under
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> https://cwiki.apache.org/confluence/display/VCL/Adding+support+for+partimage+and+partimage-ng+to+xCAT+2.x+%28unofficial%29
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> to enable partimage support for xcat. I wasn't sure if I need
>>>>>>>>>>>>>>> to change references to x86 and x86_64 (as directories) to
>>>>>>>>>>>>>>> reflect the ppc architecture, as the web page says "The
>>>>>>>>>>>>>>> architecture for the node must always be set to x86 for
>>>>>>>>>>>>>>> this..". I have with me the vmlinuz (kernel image) and initrd
>>>>>>>>>>>>>>> for the capture process. The 2 nodeset commands
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> By this, do you mean you have vmlinuz and initrd for your
>>>>>>>>>>>>>> power blades, not the ones linked to off of the page you
>>>>>>>>>>>>>> listed above? If you do, that's a good start. However,
>>>>>>>>>>>>>> you'll also need rootimg.gz.
>>>>>>>>>>>>>> rootimg.gz is the root
>>>>>>>>>>>>>> filesystem for the stateless image. It also contains the
>>>>>>>>>>>>>> partimage and partimageng binaries. Assuming partimage or
>>>>>>>>>>>>>> partimageng can actually capture partitions from power
>>>>>>>>>>>>>> systems, you'll need to compile at least one of them to run
>>>>>>>>>>>>>> on power. For the rootimg.gz image I provided, I compiled
>>>>>>>>>>>>>> them statically so that I didn't have to worry about
>>>>>>>>>>>>>> including any library dependencies in rootimg.gz.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> It would be a good idea to research how to use xcat's genimage
>>>>>>>>>>>>>> command to generate stateless images to learn how to do this.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> If there's any part of the above that you don't fully
>>>>>>>>>>>>>> understand, please ask me to clarify it. Until you have a
>>>>>>>>>>>>>> stateless image that you can deploy to your power blades,
>>>>>>>>>>>>>> there's no point in trying to debug any VCL specific items.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Josh
>>>>>>>>>>>>>> - --
>>>>>>>>>>>>>> - -------------------------------
>>>>>>>>>>>>>> Josh Thompson
>>>>>>>>>>>>>> VCL Developer
>>>>>>>>>>>>>> North Carolina State University
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> my GPG/PGP key can be found at pgp.mit.edu
>>>>>>>>>>>>>> -----BEGIN PGP SIGNATURE-----
>>>>>>>>>>>>>> Version: GnuPG v2.0.17 (GNU/Linux)
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> iEYEARECAAYFAk3mRYsACgkQV/LQcNdtPQNnVgCbB9ZFJn0+C45RC/g75RqGZY/j
>>>>>>>>>>>>>> PZYAniP2Eam7nxgiDWUnp5sKPYPO4OMa
>>>>>>>>>>>>>> =exBV
>>>>>>>>>>>>>> -----END PGP SIGNATURE-----
>>>
>>> - --
>>> - -------------------------------
>>> Josh Thompson
>>> VCL Developer
>>> North Carolina State University
>>>
>>> my GPG/PGP key can be found at pgp.mit.edu
>>> -----BEGIN PGP SIGNATURE-----
>>> Version: GnuPG v2.0.17 (GNU/Linux)
>>>
>>> iEYEARECAAYFAk4VzQoACgkQV/LQcNdtPQM8YQCePg3O5vp5AXEhiO+5aIRIUO/S
>>> 6IgAn1Xt4ytGnmxpfJVteCScFi0dRz15
>>> =Yls1
>>> -----END PGP SIGNATURE-----
> - --
> - -------------------------------
> Josh Thompson
> VCL Developer
> North Carolina State University
>
> my GPG/PGP key can be found at pgp.mit.edu
> -----BEGIN PGP SIGNATURE-----
> Version: GnuPG v2.0.17 (GNU/Linux)
>
> iEYEARECAAYFAk4V9PEACgkQV/LQcNdtPQM6ZgCfaPLJh9MuEVLqRYdHNLqC8BzQ
> JOsAn35U1e4V+xuxFPajb2rVVcg4gril
> =CWDr
> -----END PGP SIGNATURE-----
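
A note for anyone tracing the vcld log quoted in this thread: each entry appears to follow the pattern timestamp|PID|request:reservation|state|file:sub(line)|message. The sketch below is in Python rather than VCL's Perl, and it only splits one such entry into its parts so that failures like the SSH "Permission denied" output are easier to filter. The field names, the LogEntry type, and parse_entry are hypothetical labels inferred from the excerpt (in particular, which of the two "30:30" numbers is the request and which is the reservation is an assumption); none of them come from the VCL source.

```python
import re
from typing import NamedTuple, Optional


class LogEntry(NamedTuple):
    """One vcld log entry. Field names are assumptions made for this sketch."""
    timestamp: str
    pid: int
    request_id: int       # assumed: first number of the "30:30" pair
    reservation_id: int   # assumed: second number of the pair
    state: str
    source: str           # file name, e.g. "xCAT.pm" or "vcld"
    sub: str              # subroutine name
    line: int
    message: str


# Layout inferred from the quoted log, not from VCL documentation.
ENTRY_RE = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})"  # timestamp
    r"\|(\d+)"                                  # PID
    r"\|(\d+):(\d+)"                            # request:reservation
    r"\|([^|]*)"                                # state
    r"\|([^:|]+):(\w+)\((\d+)\)"                # file:sub(line)
    r"\|(.*)$"                                  # message
)


def parse_entry(text: str) -> Optional[LogEntry]:
    """Parse a single-line entry; return None for continuation or other lines."""
    match = ENTRY_RE.match(text.strip())
    if match is None:
        return None
    ts, pid, req, res, state, source, sub, line, message = match.groups()
    return LogEntry(ts, int(pid), int(req), int(res), state,
                    source, sub, int(line), message.strip())


if __name__ == "__main__":
    sample = ("2011-07-05 11:03:25|9866|30:30|image|"
              "xCAT.pm:does_image_exist(2506)|"
              "image does NOT exist: rh5image-power010701bi34-v0")
    entry = parse_entry(sample)
    if entry is not None:
        print(entry.source, entry.sub, entry.line)
        print(entry.message)
```

Continuation lines that start with "|9866|30:30|image|" carry no timestamp, so parse_entry returns None for them; a fuller tool would attach those lines to the preceding entry.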