Josh,

Could you provide me with the links to the resources of the VCL workshop? If 
there is a way to witness the workshop while it is in progress, that would help 
too. 

Regards,
Sunil


On Jul 7, 2011, at 2:03 PM, Josh Thompson wrote:

> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
> 
> Sunil,
> 
> On Thursday July 07, 2011, Sunil Venkatesh wrote:
>> Thanks Josh. My professor was asking about the details of VCL workshop
>> in NC. Are you aware of these details?
> 
> The workshop is hosted by NCSU.  It takes people from an introduction to VCL 
> to actually installing and managing it.  It is already full, but I think 
> recordings of the sessions may be available when it is over.
> 
>> 
>> Please bear with my comments inline.
> 
> Responses also inline.
> 
>> On 7/7/11 11:13 AM, Josh Thompson wrote:
>>> -----BEGIN PGP SIGNED MESSAGE-----
>>> Hash: SHA1
>>> 
>>> On Tuesday July 05, 2011, Sunil Venkatesh wrote:
>>>> Hi Josh,
>>>> 
>>>> I was able to get the following things done with respect to getting VCL to
>>>> work on POWER.
>>>> 
>>>> 1. Made modifications in the xcat tables to get the capture process
>>>> working with statelite images instead of stateless images, particularly
>>>> the noderes & bootparams tables.
>>>> 
>>>> 2. Used partimage to capture the images (did NOT set usepartimageng to
>>>> 1).
>>>> 
>>>> -rw-r--r-- 1 root root    0 Jul  5 16:38 compute.img.capturedone
>>>> -rw-r--r-- 1 root root    0 Jul  5 15:58 compute.img.capturefailed
>>>> -rw------- 1 root root 6.5M Jul  5 16:07 compute-parta2.gz
>>>> -rw------- 1 root root 679M Jul  5 16:10 compute-parta3.gz
>>>> -rw------- 1 root root  23M Jul  5 16:38 compute-parta6.gz
>>>> -rw-r--r-- 1 root root  512 Jul  5 16:07 compute-sda.mbr
>>>> -rw-r--r-- 1 root root  363 Jul  5 16:07 compute-sda.sfdisk
>>>> 
>>>> 
>>>> 2 partitions including the boot partition present on the blade were
>>>> captured under /install/image/ppc64/. Initially, RHEL 5 was installed on
>>>> a 600 GB partition due to which the capture process failed. The image of
>>>> the partition was generated once the partition size was reduced to 6GB.
>>>> Is it necessary for me to use partimage-ng instead of partimage itself?
>>> 
>>> Are you asking if you need to use partimage-ng for partitions that are
>>> 600GB? If so, I don't really know.  We've never dealt with partitions
>>> that large.
>> 
>> Here, I am just asking if images captured using partimage are recognized
>> by VCL or if it is required that I use partimage-ng. From your earlier
>> emails to Prem, I noticed that the only difference between
>> partimage & partimage-ng (after setting usepartimageng to 1) is that the
>> former generates images with .gz and the latter generates .img. Am I
>> right here? Also, I was able to get the 600GB partition captured; since
>> the partition was empty, it resulted in a ~17MB image file.
> 
> VCL can deploy images captured with both partimage and partimage-ng.  At
> NCSU, we were going to switch to partimage-ng, which is why I added support
> for it, but then we realized we'd have to upgrade all of our management nodes
> to xCAT2 at the same time, or some of them wouldn't be able to deploy images
> newly captured with partimage-ng (the support for xCAT1.x can't deploy using
> partimage-ng).  So, we just stuck with partimage.  The captured file format
> differs between the two.
> 
>>>> When proceeding further with "vcld --setup", the script was not able to
>>>> find the images that were created using partimage. The options that are
>>>> provided in the script do not allow for selecting an architecture
>>>> other than x86/x86_64.
>>> 
>>> You'll need to modify the vcld image.pm module.  Look in
>>> /usr/local/vcl/lib/VCL.  In image.pm, look for the function
>>> 'setup_capture_base_image'; then, find 'my @architecture_choices' and add
>>> 'ppc' as another option.
>> 
>> As a matter of fact, I tried this step. But, the
>> _get_image_repository_path function in
>> /usr/local/vcl/lib/VCL/Module/Provisioning/xCAT.pm does not recognize
>> the architecture when I choose ppc/ppc64 in the menu. On line 2922 in
>> the same file, image_architecture is set to undefined. I think the list
>> of supported architectures is stored in some mysql table. I haven't
>> checked this yet; I was trying to get VCL to recognize the images
>> as x86/x86_64 by setting up soft links in the search paths of VCL.
> 
> This and your next question are both deeper into the backend code than I've 
> worked with.  Andy or Aaron may be able to answer your questions further.
> 
> Josh
> 
>>>> Also, in the error log vcld is looking for
>>>> 
>>>> /opt/xcat/share/xcat/install/image/rh5image-power010701bi34-v0.tmpl
>>>> 
>>>> and cannot find the template file. Should the template file that needs
>>>> to be accessed in this case be createimage.ppc64.tmpl?
>>> 
>>> This is actually a check to make sure the image doesn't already exist
>>> before trying to capture it.  So, it is good that it doesn't find it.
>> 
>> If possible, could you please provide me with the details of the steps that
>> take place here? If there is any documentation available regarding
>> this, that would work too. You said "image doesn't already exist before
>> trying to capture it"; how does VCL capture the images? Does it make
>> use of the images that are already generated using partimage? If so, in
>> what places does it look for the images?
>> 
>> Sorry for asking too many questions. I could trace the scripts to check
>> the flow, but, that would take a lot of time. You have been really
>> patient with all my queries, appreciate that.
>> 
>> Thanks
>> Sunil
>> 
>>> It sounds like you're almost there.  Great work!
>>> 
>>> Josh
>>> 
>>>> I have attached a log at the end of the mail. I am not sure where I have
>>>> gone wrong with the VCL configuration.
>>>> 
>>>> -Sunil
>>>> 
>>>> -----
>>>> 
>>>> rh5image-power010701bi34-v0 image creation failed
>>>> ------------------------------------------------------------------------
>>>> time: 2011-07-05 11:03:25
>>>> caller: image.pm:reservation_failed(385)
>>>> ( 0) image.pm, reservation_failed (line: 385)
>>>> (-1) image.pm, process (line: 167)
>>>> (-2) vcld, make_new_child (line: 568)
>>>> (-3) vcld, main (line: 346)
>>>> ------------------------------------------------------------------------
>>>> management node: web1.bluegrit.cs.umbc.edu
>>>> reservation PID: 9866
>>>> parent vcld PID: 19110
>>>> 
>>>> request ID: 30
>>>> reservation ID: 30
>>>> request state/laststate: image/image
>>>> request start time: 2011-07-05 11:03:20
>>>> request end time: 2011-07-05 12:03:20
>>>> for imaging: no
>>>> log ID: none
>>>> 
>>>> computer: power01.bluegrit.cs.umbc.edu
>>>> computer id: 2
>>>> computer type: blade
>>>> computer eth0 MAC address: <undefined>
>>>> computer eth1 MAC address: <undefined>
>>>> computer private IP address: 172.20.106.1
>>>> computer public IP address: 172.20.106.1
>>>> computer in block allocation: no
>>>> provisioning module: VCL::Module::Provisioning::xCAT2
>>>> 
>>>> image: rh5image-power010701bi34-v0
>>>> image display name: power010701bi
>>>> image ID: 34
>>>> image revision ID: 34
>>>> image size: 1450 MB
>>>> use Sysprep: yes
>>>> root access: yes
>>>> image owner ID: 1
>>>> image owner affiliation: Local
>>>> image revision date created: 2011-07-05 11:03:25
>>>> image revision production: yes
>>>> OS module: VCL::Module::OS::Linux
>>>> 
>>>> user: admin
>>>> user name: vcl admin
>>>> user ID: 1
>>>> user affiliation: Local
>>>> ------------------------------------------------------------------------
>>>> RECENT LOG ENTRIES FOR THIS PROCESS:
>>>> 2011-07-05 11:03:25|9866|30:30|image|Module.pm:create_os_object(304)|VCL::Module::OS::Linux OS object created for rh5image-power010701bi34-v0, address: 88fb070
>>>> 2011-07-05 11:03:25|9866|30:30|image|xCAT.pm:initialize(110)|XCATROOT environment variable is not set, using /opt/xcat
>>>> 2011-07-05 11:03:25|9866|30:30|image|xCAT.pm:initialize(128)|xCAT root path found: /opt/xcat
>>>> 2011-07-05 11:03:25|9866|30:30|image|xCAT.pm:initialize(130)|xCAT module initialized
>>>> 2011-07-05 11:03:25|9866|30:30|image|xCAT2.pm:initialize(110)|XCATROOT environment variable is not set, using /opt/xcat
>>>> 2011-07-05 11:03:25|9866|30:30|image|xCAT2.pm:initialize(128)|xCAT root path found: /opt/xcat
>>>> 2011-07-05 11:03:25|9866|30:30|image|xCAT2.pm:initialize(130)|xCAT module initialized
>>>> 2011-07-05 11:03:25|9866|30:30|image|Module.pm:create_provisioning_object(420)|VCL::Module::Provisioning::xCAT2 module loaded
>>>> 2011-07-05 11:03:25|9866|30:30|image|Module.pm:create_mn_os_object(335)|management node OS object has already been created, address: 88f23b0, returning 1
>>>> 2011-07-05 11:03:25|9866|30:30|image|Module.pm:new(200)|VCL::Module::Provisioning::xCAT2 object created for computer power01, address: 88fb0e0
>>>> 2011-07-05 11:03:25|9866|30:30|image|xCAT2.pm:initialize(110)|XCATROOT environment variable is not set, using /opt/xcat
>>>> 2011-07-05 11:03:25|9866|30:30|image|xCAT2.pm:initialize(128)|xCAT root path found: /opt/xcat
>>>> 2011-07-05 11:03:25|9866|30:30|image|xCAT2.pm:initialize(130)|xCAT module initialized
>>>> 2011-07-05 11:03:25|9866|30:30|image|Module.pm:create_provisioning_object(426)|VCL::Module::Provisioning::xCAT2 provisioner object created for power01, address: 88fb0e0
>>>> 2011-07-05 11:03:25|9866|30:30|image|State.pm:initialize(126)|returning 1
>>>> 2011-07-05 11:03:25|9866|30:30|image|vcld:make_new_child(565)|VCL::image object created and initialized
>>>> 2011-07-05 11:03:25|9866|30:30|image|utils.pm:mail(1268)|SUCCESS -- Sending mail To:shru...@gmail.com, VCL IMAGE Creation Started: rh5image-power010701bi34-v0
>>>> 2011-07-05 11:03:25|9866|30:30|image|xCAT.pm:does_image_exist(2434)|image OS install type: partimage
>>>> 2011-07-05 11:03:25|9866|30:30|image|xCAT.pm:_get_image_repository_path(2910)|management node identifier argument was not specified
>>>> 
>>>> 2011-07-05 11:03:25|9866|30:30|image|xCAT.pm:_get_image_repository_path(2932)|attempting to determine repository path for image on web1.bluegrit.cs.umbc.edu:
>>>> |9866|30:30|image| image id: 34
>>>> |9866|30:30|image| OS name: rh5image
>>>> |9866|30:30|image| OS type: linux
>>>> |9866|30:30|image| OS install type: partimage
>>>> |9866|30:30|image| OS source path: image
>>>> |9866|30:30|image| architecture: x86_64
>>>> 
>>>> 2011-07-05 11:03:25|9866|30:30|image|xCAT.pm:_get_image_repository_path(2996)|did not find any images under /tftpboot/xcat//linux_image/x86_64 on web1.bluegrit.cs.umbc.edu
>>>> 2011-07-05 11:03:25|9866|30:30|image|xCAT.pm:_get_image_repository_path(3006)|returning repository path for web1.bluegrit.cs.umbc.edu: /tftpboot/xcat//image/x86_64
>>>> 2011-07-05 11:03:25|9866|30:30|image|xCAT.pm:does_image_exist(2444)|image repository path: /tftpboot/xcat//image/x86_64
>>>> 
>>>> 2011-07-05 11:03:25|9866|30:30|image|utils.pm:run_command(9010)|executed command: du -c /tftpboot/xcat//image/x86_64/*rh5image-power010701bi34-v0* 2>&1 | grep total 2>&1, pid: 9877, exit status: 0, output:
>>>> |9866|30:30|image| 0 total
>>>> 
>>>> 2011-07-05 11:03:25|9866|30:30|image|xCAT.pm:does_image_exist(2506)|image does NOT exist: rh5image-power010701bi34-v0
>>>> 2011-07-05 11:03:25|9866|30:30|image|xCAT2.pm:_get_image_template_path(2084)|management node identifier argument was not specified
>>>> 
>>>> 2011-07-05 11:03:25|9866|30:30|image|xCAT2.pm:_get_image_template_path(2115)|attempting to determine template path for image:
>>>> |9866|30:30|image| image name: rh5image-power010701bi34-v0
>>>> |9866|30:30|image| OS install type: partimage
>>>> |9866|30:30|image| OS source path: image
>>>> |9866|30:30|image| xCAT 2.x OS source path: image
>>>> 
>>>> 2011-07-05 11:03:25|9866|30:30|image|xCAT2.pm:_get_image_template_path(2123)|returning: /opt/xcat/share/xcat/install/image
>>>> 2011-07-05 11:03:25|9866|30:30|image|xCAT.pm:does_image_exist(2518)|template repository path for rh5image-power010701bi34-v0: /opt/xcat/share/xcat/install/image
>>>> 2011-07-05 11:03:25|9866|30:30|image|xCAT.pm:does_image_exist(2530)|template file does not exist: /opt/xcat/share/xcat/install/image/rh5image-power010701bi34-v0.tmpl
>>>> 2011-07-05 11:03:25|9866|30:30|image|xCAT.pm:does_image_exist(2570)|image rh5image-power010701bi34-v0 does NOT exist on this management node
>>>> 2011-07-05 11:03:25|9866|30:30|image|image.pm:process(145)|image rh5image-power010701bi34-v0 does not exist in the repository
>>>> 2011-07-05 11:03:25|9866|30:30|image|DataStructure.pm:_automethod(834)|data structure updated: $self->request_data->{reservation}{30}{image}{lastupdate}
>>>> |9866|30:30|image| image_lastupdate = 2011-07-05 11:03:25
>>>> 
>>>> 2011-07-05 11:03:25|9866|30:30|image|DataStructure.pm:_automethod(834)|data structure updated: $self->request_data->{reservation}{30}{imagerevision}{datecreated}
>>>> |9866|30:30|image| imagerevision_date_created = 2011-07-05 11:03:25
>>>> 
>>>> 2011-07-05 11:03:25|9866|30:30|image|image.pm:process(161)|calling provisioning module's capture() subroutine
>>>> 2011-07-05 11:03:25|9866|30:30|image|xCAT2.pm:capture(776)|image=rh5image-power010701bi34-v0, computer=power01
>>>> 
>>>> 2011-07-05 11:03:25|9866|30:30|image|utils.pm:run_ssh_command(5380)|executing SSH command on power01:
>>>> |9866|30:30|image| /usr/bin/ssh -i /etc/vcl/vcl.key  -o StrictHostKeyChecking=no -l root -p 22 -x power01 'chown root currentimage.txt; chmod 777 currentimage.txt' 2>&1
>>>> 
>>>> 2011-07-05 11:03:25|9866|30:30|image|utils.pm:run_ssh_command(5464)|run_ssh_command output:
>>>> |9866|30:30|image| Permission denied, please try again.
>>>> |9866|30:30|image| Permission denied, please try again.
>>>> |9866|30:30|image| Permission denied (publickey,gssapi-with-mic,password).
>>>> 
>>>> 2011-07-05 11:03:25|9866|30:30|image|utils.pm:run_ssh_command(5474)|SSH command executed on power01, command:
>>>> |9866|30:30|image| /usr/bin/ssh -i /etc/vcl/vcl.key  -o StrictHostKeyChecking=no -l root -p 22 -x power01 'chown root currentimage.txt; chmod 777 currentimage.txt' 2>&1
>>>> |9866|30:30|image| returning (255, "Permission denied, please try ...")
>>>> 
>>>> 2011-07-05 11:03:25|9866|30:30|image|utils.pm:write_currentimage_txt(5685)|updated ownership and permissions on currentimage.txt
>>>> 
>>>> 2011-07-05 11:03:25|9866|30:30|image|utils.pm:run_ssh_command(5380)|executing SSH command on power01:
>>>> |9866|30:30|image| /usr/bin/ssh -i /etc/vcl/vcl.key  -o StrictHostKeyChecking=no -l root -p 22 -x power01 'echo -e "rh5image-power010701bi34-v0\r\nid=34\r\nprettyname=power010701bi\r\nimagerevision_id=34\r\nimagerevision_datecreated=2011-07-05 11:03:25\r\ncomputer_id=2\r\ncomputer_hostname=power01.bluegrit.cs.umbc.edu" > currentimage.txt && cat currentimage.txt' 2>&1
>>>> 
>>>> 2011-07-05 11:03:25|9866|30:30|image|utils.pm:run_ssh_command(5464)|run_ssh_command output:
>>>> |9866|30:30|image| Permission denied, please try again.
>>>> |9866|30:30|image| Permission denied, please try again.
>>>> |9866|30:30|image| Permission denied (publickey,gssapi-with-mic,password).
>>>> 
>>>> 2011-07-05 11:03:25|9866|30:30|image|utils.pm:run_ssh_command(5474)|SSH command executed on power01, command:
>>>> |9866|30:30|image| /usr/bin/ssh -i /etc/vcl/vcl.key  -o StrictHostKeyChecking=no -l root -p 22 -x power01 'echo -e "rh5image-power010701bi34-v0\r\nid=34\r\nprettyname=power010701bi\r\nimagerevision_id=34\r\nimagerevision_datecreated=2011-07-05 11:03:25\r\ncomputer_id=2\r\ncomputer_hostname=power01.bluegrit.cs.umbc.edu" > currentimage.txt && cat currentimage.txt' 2>&1
>>>> |9866|30:30|image| returning (255, "Permission denied, please try ...")
>>>> |9866|30:30|image| ---- WARNING ----
>>>> |9866|30:30|image| 2011-07-05 11:03:25|9866|30:30|image|utils.pm:write_currentimage_txt(5699)|failed to create currentimage.txt file on power01:
>>>> |9866|30:30|image| Permission denied, please try again.
>>>> |9866|30:30|image| Permission denied, please try again.
>>>> |9866|30:30|image| Permission denied (publickey,gssapi-with-mic,password).
>>>> |9866|30:30|image| ( 0) utils.pm, write_currentimage_txt (line: 5699)
>>>> |9866|30:30|image| (-1) xCAT2.pm, capture (line: 779)
>>>> |9866|30:30|image| (-2) image.pm, process (line: 162)
>>>> |9866|30:30|image| (-3) vcld, make_new_child (line: 568)
>>>> |9866|30:30|image| (-4) vcld, main (line: 346)
>>>> |9866|30:30|image| ---- WARNING ----
>>>> |9866|30:30|image| 2011-07-05 11:03:25|9866|30:30|image|xCAT2.pm:capture(783)|unable to update currentimage.txt on power01
>>>> |9866|30:30|image| ( 0) xCAT2.pm, capture (line: 783)
>>>> |9866|30:30|image| (-1) image.pm, process (line: 162)
>>>> |9866|30:30|image| (-2) vcld, make_new_child (line: 568)
>>>> |9866|30:30|image| (-3) vcld, main (line: 346)
>>>> |9866|30:30|image| ---- WARNING ----
>>>> |9866|30:30|image| 2011-07-05 11:03:25|9866|30:30|image|image.pm:process(166)|rh5image-power010701bi34-v0 image failed to be captured by provisioning module
>>>> |9866|30:30|image| ( 0) image.pm, process (line: 166)
>>>> |9866|30:30|image| (-1) vcld, make_new_child (line: 568)
>>>> |9866|30:30|image| (-2) vcld, main (line: 346)
>>>> 
>>>> 2011-07-05 11:03:25|9866|30:30|image|DataStructure.pm:get_computer_private_ip_address(1581)|attempting to retrieve private IP address for computer: power01
>>>> 2011-07-05 11:03:25|9866|30:30|image|DataStructure.pm:get_computer_private_ip_address(1585)|retrieved contents of /etc/hosts on this management node, contains 158 lines
>>>> 2011-07-05 11:03:25|9866|30:30|image|DataStructure.pm:get_computer_private_ip_address(1645)|returning IP address from /etc/hosts file: 172.20.106.1
>>>> 2011-07-05 11:03:25|9866|30:30|image|utils.pm:is_inblockrequest(6163)|zero rows were returned from database select
>>>> 2011-07-05 11:03:25|9866|30:30|image|DataStructure.pm:get_image_affiliation_name(2035)|image owner id: 1
>>>> 2011-07-05 11:03:25|9866|30:30|image|utils.pm:getnewdbh(2709)|database requested (information_schema) does not match handle stored in $ENV{dbh} (vcl:)
>>>> 2011-07-05 11:03:25|9866|30:30|image|utils.pm:getnewdbh(2760)|database handle stored in $ENV{dbh}
>>>> 2011-07-05 11:03:25|9866|30:30|image|DataStructure.pm:retrieve_user_data(1352)|attempting to retrieve and store data for user: user.id = '1'
>>>> 2011-07-05 11:03:25|9866|30:30|image|utils.pm:getnewdbh(2709)|database requested (vcl) does not match handle stored in $ENV{dbh} (information_schema:)
>>>> 2011-07-05 11:03:25|9866|30:30|image|utils.pm:getnewdbh(2760)|database handle stored in $ENV{dbh}
>>>> 2011-07-05 11:03:25|9866|30:30|image|DataStructure.pm:retrieve_user_data(1415)|data has been retrieved for user: admin (id: 1)
>>>> 
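For anyone tracing a log like the one quoted above: once unwrapped, each vcld entry appears to be pipe-delimited as timestamp, PID, request:reservation IDs, request state, source location, and message. The following is a minimal Python sketch for splitting such lines. The field names and the parse_vcld_line helper are assumptions made for illustration; they are not part of VCL.

```python
import re
from typing import NamedTuple, Optional

# Assumed layout, inferred from the quoted log:
# timestamp|pid|request:reservation|state|file:sub(line)|message
LOG_LINE = re.compile(
    r"^(?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\|"
    r"(?P<pid>\d+)\|"
    r"(?P<request_id>\d+):(?P<reservation_id>\d+)\|"
    r"(?P<state>[^|]*)\|"
    r"(?P<module>[^:|]+):(?P<sub>[^(|]+)\((?P<line>\d+)\)\|"
    r"(?P<message>.*)$"
)


class LogEntry(NamedTuple):
    timestamp: str
    pid: int
    request_id: int
    reservation_id: int
    state: str
    module: str
    sub: str
    line: int
    message: str


def parse_vcld_line(text: str) -> Optional[LogEntry]:
    """Return a LogEntry for a full log line, or None for continuation lines."""
    match = LOG_LINE.match(text.strip())
    if match is None:
        return None
    return LogEntry(
        timestamp=match["timestamp"],
        pid=int(match["pid"]),
        request_id=int(match["request_id"]),
        reservation_id=int(match["reservation_id"]),
        state=match["state"],
        module=match["module"],
        sub=match["sub"],
        line=int(match["line"]),
        message=match["message"],
    )
```

Continuation lines (those beginning with "|9866|30:30|image|") return None, so a caller could attach them to the preceding entry.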
>>>> On 6/24/11 10:13 AM, Josh Thompson wrote:
>>>>> Sunil,
>>>>> 
>>>>> "nodeset <nodename> image" sets up all the xCAT stuff so that the next
>>>>> time the node is booted, it will boot the stateless/statelite image and
>>>>> capture an image of the node.
>>>>> 
>>>>> Can you double check that you have 'os' in the nodetype table set to
>>>>> image for the node you are using?  If you look in the partimageng.pm
>>>>> xCAT module, you see toward the top where it registers the
>>>>> "handled_commands".  The "mk" gets stripped off.  So, that module is
>>>>> registering "install" and "image" for os type = "image".  As long as
>>>>> you have os in the nodetype table set to image, it should be using
>>>>> that module.
>>>>> 
>>>>> You will need to make sure you have all of the required files in
>>>>> locations using 'ppc64' as the arch.
>>>>> 
>>>>> Josh
>>>>> 
>>>>> On Wednesday June 22, 2011, Sunil Venkatesh wrote:
>>>>>> Hi,
>>>>>> 
>>>>>> Update !
>>>>>> 
>>>>>> I was able to fix the problem that I was facing with the scripts by
>>>>>> disabling the firewall. But, I still have a problem with the command-
>>>>>> 
>>>>>> nodeset <nodename> image
>>>>>> 
>>>>>> Unless this error is fixed, I don't think partimage will work. Am I
>>>>>> right here?
>>>>>> 
>>>>>> Thanks,
>>>>>> Sunil
>>>>>> 
>>>>>> On Tue, Jun 21, 2011 at 3:13 PM, Sunil Venkatesh <suni...@umbc.edu> wrote:
>>>>>>> Josh,
>>>>>>> 
>>>>>>> I have reached a point where I am able to boot the ppc using the
>>>>>>> statelite images created using genimage. But, I was wondering how
>>>>>>> significant the following command is.
>>>>>>> 
>>>>>>> nodeset <nodename> image
>>>>>>> 
>>>>>>> I got the same error that Prem had mentioned.
>>>>>>> 
>>>>>>> 
>>>>>>> power01: Error: Unable to identify plugin for this command, check
>>>>>>> relevant tables: nodetype.os
>>>>>>> Error: Some nodes failed to set up image resources, aborting
>>>>>>> 
>>>>>>> I tried changing the 'os' field to 'image' under nodetype, that
>>>>>>> doesn't seem to help. I get the same error even after the change.
>>>>>>> 'arch' in my case is set to 'ppc64'.
>>>>>>> 
>>>>>>> 
>>>>>>> Also, I think partimage plugin needs to be changed to support the ppc
>>>>>>> architecture, from what you had mentioned in the other thread.
>>>>>>> 
>>>>>>> I am not sure what the command 'nodeset <nodename> image' does, but,
>>>>>>> I am able to boot the statelite images by making changes to the
>>>>>>> yaboot configuration files. The ppc blade currently uses LVM, that
>>>>>>> needs to be replaced with ext2/ext3 from what I read from the other
>>>>>>> thread, am I right? Also, just out of curiosity I left the statelite
>>>>>>> image to boot with my current setting. I can see the xcat script
>>>>>>> throwing an error-
>>>>>>> 
>>>>>>> /opt/xcat/xcatdsklspost: line 229: /xcatpost/getpostscript.awk: No
>>>>>>> such file or directory
>>>>>>> /tmp/mypostscript: line 16: updateflag.awk: command not found
>>>>>>> 
>>>>>>> Both getpostscript.awk & updateflag.awk are missing from the rootimg
>>>>>>> created by genimage. Is there any place I could find these scripts?
>>>>>>> 
>>>>>>> Also, please correct me if there is anything wrong with the procedure
>>>>>>> I am following.
>>>>>>> 
>>>>>>> 
>>>>>>> Thanks in advance.
>>>>>>> 
>>>>>>> Regards,
>>>>>>> Sunil
>>>>>>> 
>>>>>>> On 6/13/11 4:13 PM, Josh Thompson wrote:
>>>>>>>> Sunil,
>>>>>>>> 
>>>>>>>> From what I remember, I didn't have to do much to the rootimg.gz
>>>>>>>> image to make it work.  I created the files I supply before xCAT
>>>>>>>> started using "statelite" instead of "stateless".  I think statelite
>>>>>>>> uses NFS to mount the image, and stateless uses an image file
>>>>>>>> downloaded to the node and run out of RAM.  Since generating a
>>>>>>>> statelite image is pretty straightforward use of xCAT, you may want
>>>>>>>> to ask on the xcat-user email list for help with it.
>>>>>>>> 
>>>>>>>> Unless you can have the admins of the other dhcp server on your
>>>>>>>> network exclude the MAC addresses of your blades, you'll need to
>>>>>>>> create a separate private network to control your VCL stuff, either
>>>>>>>> physically or with VLANs.
>>>>>>>> 
>>>>>>>> If they can exclude the MACs, you can set up the dhcp server on your
>>>>>>>> management node to only answer to requests from your blades.
>>>>>>>> 
>>>>>>>> Josh
>>>>>>>> 
>>>>>>>> On Monday June 13, 2011, Sunil Venkatesh wrote:
>>>>>>>>> Josh,
>>>>>>>>> 
>>>>>>>>> Again, thank you for your valuable inputs. I have got to the point
>>>>>>>>> where I can get the compute node to boot using the stateless
>>>>>>>>> images. I had to manually configure the netboot since we already
>>>>>>>>> had a DHCP server which is not the same as our management node.
>>>>>>>>> Since our setup is not in an isolated environment, I could not let
>>>>>>>>> xcat handle the dhcp & netboot configuration (it messed up our
>>>>>>>>> network configuration when I let xcat handle it; we had 2 dhcp
>>>>>>>>> servers running at that point). Are you aware of any way to let
>>>>>>>>> xcat handle such scenarios?
>>>>>>>>> 
>>>>>>>>> Although I am able to get the compute node to boot with the kernel
>>>>>>>>> image & initrd, and NFS mount the rootimg that was generated
>>>>>>>>> using 'genimage', I am getting the following error on the compute
>>>>>>>>> node's console -
>>>>>>>>> 
>>>>>>>>>       FATAL error: could not get the entries from litefile table...
>>>>>>>>> 
>>>>>>>>> After going through the init scripts, I found out that the 'xCATCmd'
>>>>>>>>> binary is not present in the rootimg. I am currently checking the
>>>>>>>>> xcat packages for its availability. If you know the procedure to get
>>>>>>>>> it onto the compute node, please let me know.
>>>>>>>>> 
>>>>>>>>> Appreciate your support.
>>>>>>>>> 
>>>>>>>>> Thanking you,
>>>>>>>>> Sunil
>>>>>>>>> 
>>>>>>>>> On 6/8/11 9:02 AM, Josh Thompson wrote:
>>>>>>>>>> Sunil,
>>>>>>>>>> 
>>>>>>>>>> I don't recall seeing any documentation on those parts.  I had to
>>>>>>>>>> poke around looking at parts of xCAT to see how it worked.  It's
>>>>>>>>>> been a few years since I did that; so, I don't remember much about
>>>>>>>>>> the process. My recommendation would be to start looking at things
>>>>>>>>>> in the rootimg.gz image.  Looking at it now, I see that
>>>>>>>>>> /opt/xcat/xcatdsklspost gets run when rootimg.gz boots.  It looks
>>>>>>>>>> like it downloads all of the postscripts from the management node
>>>>>>>>>> and then runs getpostscript.awk, which issues a command to xcatd to
>>>>>>>>>> get the primary postscript for that machine.  I've forgotten how
>>>>>>>>>> xcatd then builds the primary postscript. I do remember that in
>>>>>>>>>> the partimageng.pm module, I had it add the partimageng
>>>>>>>>>> postscript.
>>>>>>>>>> 
>>>>>>>>>> So, you'll really have to start digging through how the xcat
>>>>>>>>>> postscript system works.
>>>>>>>>>> 
>>>>>>>>>> Josh
>>>>>>>>>> 
>>>>>>>>>> On Tuesday June 07, 2011, Sunil Venkatesh wrote:
>>>>>>>>>>> Josh,
>>>>>>>>>>> 
>>>>>>>>>>> Is there any place I could find some details on
>>>>>>>>>>> 
>>>>>>>>>>> "... /Once the compute node is booted with the stateless
>>>>>>>>>>> image, it uses NFS to mount some things from the management node,
>>>>>>>>>>> and then runs some xcat postscripts,/.... "
>>>>>>>>>>> 
>>>>>>>>>>> I have the stateless images ready with partimage compiled for
>>>>>>>>>>> PPC. For the compute node (power 7) to boot using the stateless
>>>>>>>>>>> images, I need to configure yaboot instead of pxeboot (which is
>>>>>>>>>>> specific to x86). I wanted to know where in the startup files the
>>>>>>>>>>> execution of partimage and the NFS mount are configured. Is it
>>>>>>>>>>> configured by the "genimage" command itself? Considering the way
>>>>>>>>>>> in which the nodes are configured in the network, it would not be
>>>>>>>>>>> a good idea to let xcat take care of configuring details like
>>>>>>>>>>> DHCPD for netboot. So, I need to make changes to the configuration
>>>>>>>>>>> files manually, which is why this query came up.
>>>>>>>>>>> 
>>>>>>>>>>> Thanks in advance.
>>>>>>>>>>> 
>>>>>>>>>>> Regards,
>>>>>>>>>>> Sunil
>>>>>>>>>>> 
>>>>>>>>>>> On 6/1/11 1:39 PM, Josh Thompson wrote:
>>>>>>>>>>>> Sunil,
>>>>>>>>>>>> 
>>>>>>>>>>>> The "stateless" image I refer to is what is actually booted on
>>>>>>>>>>>> the compute node containing the image to be captured.  It's
>>>>>>>>>>>> called stateless because it is loaded completely in RAM and
>>>>>>>>>>>> does not maintain any state when a reboot occurs.
>>>>>>>>>>>> 
>>>>>>>>>>>> The partimage binary is part of this stateless image and
>>>>>>>>>>>> actually runs on the compute node.  It does not run on the
>>>>>>>>>>>> management node. The management node does not have block level
>>>>>>>>>>>> access to the disk on the compute node to be able to capture
>>>>>>>>>>>> the image from the disk.
>>>>>>>>>>>> 
>>>>>>>>>>>> I'll try to describe the process a little better.  The
>>>>>>>>>>>> management node issues a reboot command to the compute node.
>>>>>>>>>>>> The compute node uses PXE to load and boot a kernel (vmlinuz),
>>>>>>>>>>>> initial RAM disk (initrd.img), and a root filesystem
>>>>>>>>>>>> (rootimg.gz) from the management node.  All three of these
>>>>>>>>>>>> together make up the stateless image.  Once the compute node
>>>>>>>>>>>> is booted with the stateless image, it uses NFS to mount some
>>>>>>>>>>>> things from the management node, and then runs some xcat
>>>>>>>>>>>> postscripts, one of which is the partimageng postscript.  This
>>>>>>>>>>>> postscript determines what partitions are on the compute node
>>>>>>>>>>>> and, depending on how the postscript is configured, uses
>>>>>>>>>>>> partimage or partimageng to capture an image of the compute
>>>>>>>>>>>> node disk that is then saved to the management node.  When it
>>>>>>>>>>>> is finished capturing the image, it notifies xcat on the
>>>>>>>>>>>> management node and then reboots.  xcat reconfigures itself to
>>>>>>>>>>>> tell the compute node to boot off of disk at next boot.  When
>>>>>>>>>>>> the compute node comes up, it uses PXE to ask the management
>>>>>>>>>>>> node how to boot.  The management node tells it to boot off of
>>>>>>>>>>>> disk.
>>>>>>>>>>>> 
>>>>>>>>>>>> I hope that clarifies how the system works.  If any of it is
>>>>>>>>>>>> unclear, please ask for further clarification.
>>>>>>>>>>>> 
>>>>>>>>>>>> Josh
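
The capture sequence Josh describes above can be summarized as an ordered list of steps. This Python sketch only restates that description in code form; the CaptureStep names and the next_step helper are hypothetical and do not correspond to actual xCAT or VCL code.

```python
from enum import Enum
from typing import Optional


class CaptureStep(Enum):
    """Steps of the image capture flow, in the order described in the email."""
    REBOOT_ISSUED = "management node issues a reboot command to the compute node"
    NETBOOT_STATELESS = "compute node loads kernel, initrd, and rootimg.gz over the network"
    NFS_MOUNT = "stateless image mounts directories from the management node over NFS"
    RUN_POSTSCRIPTS = "xcat postscripts run, including the partimageng postscript"
    CAPTURE_PARTITIONS = "partimage or partimageng saves the disk image to the management node"
    NOTIFY_XCAT = "postscript notifies xcat on the management node and reboots"
    BOOT_FROM_DISK = "management node tells the compute node to boot off of disk"


def next_step(step: CaptureStep) -> Optional[CaptureStep]:
    """Return the step that follows `step`, or None after the last one."""
    steps = list(CaptureStep)  # Enum iteration preserves definition order
    index = steps.index(step)
    return steps[index + 1] if index + 1 < len(steps) else None


if __name__ == "__main__":
    step: Optional[CaptureStep] = CaptureStep.REBOOT_ISSUED
    while step is not None:
        print(f"{step.name}: {step.value}")
        step = next_step(step)
```

Note that the partimage binary runs at the CAPTURE_PARTITIONS step on the compute node itself, which is why it has to be built for the compute node's architecture.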
>>>>>>>>>>>> 
>>>>>>>>>>>> On Wednesday June 01, 2011, Sunil Venkatesh wrote:
>>>>>>>>>>>>> Josh,
>>>>>>>>>>>>> 
>>>>>>>>>>>>> I had one more clarification.
>>>>>>>>>>>>> 
>>>>>>>>>>>>> partimage binaries run in the management node to capture an
>>>>>>>>>>>>> (stateless) image from the compute node right? In that case, is
>>>>>>>>>>>>> there a need for these binaries to go into the rootimg.gz??
>>>>>>>>>>>>> 
>>>>>>>>>>>>> My assumption is, partimage runs on the management node (an
>>>>>>>>>>>>> intel blade in our case) to capture a stateless image from a
>>>>>>>>>>>>> compute node (a power 7 blade) and stores these images under "
>>>>>>>>>>>>> /install " of the management node. Please correct me if I am
>>>>>>>>>>>>> wrong here.
>>>>>>>>>>>>> 
>>>>>>>>>>>>> Regards,
>>>>>>>>>>>>> Sunil
>>>>>>>>>>>>> 
>>>>>>>>>>>>> On 6/1/11 9:58 AM, Josh Thompson wrote:
>>>>>>>>>>>>>> -----BEGIN PGP SIGNED MESSAGE-----
>>>>>>>>>>>>>> Hash: SHA1
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> On Tuesday May 31, 2011, Sunil Venkatesh wrote:
>>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> I used the steps that were mentioned under
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> https://cwiki.apache.org/confluence/display/VCL/Adding+support+for+partimage+and+partimage-ng+to+xCAT+2.x+%28unofficial%29
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> to enable partimage support for xcat. I wasn't sure if I need
>>>>>>>>>>>>>>> to change references to x86 & x86_64 (as directories) to
>>>>>>>>>>>>>>> reflect the ppc architecture, as the web page says "The
>>>>>>>>>>>>>>> architecture for
>>>>>>>>>>>>>>> the node must always be set to x86 for this..". I have with
>>>>>>>>>>>>>>> me the vmlinuz (kernel image) and initrd for the capture
>>>>>>>>>>>>>>> process. The 2 nodeset commands
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> By this, do you mean you have vmlinuz and initrd for your
>>>>>>>>>>>>>> power blades, not the ones linked to off of the page you
>>>>>>>>>>>>>> listed above? If you do, that's a good start.  However,
>>>>>>>>>>>>>> you'll also need rootimg.gz. rootimg.gz is the root
>>>>>>>>>>>>>> filesystem for the stateless image.  It also contains the
>>>>>>>>>>>>>> partimage and partimageng binaries. Assuming partimage or
>>>>>>>>>>>>>> partimageng can actually capture partitions from power
>>>>>>>>>>>>>> systems, you'll need to compile at least one of them to run
>>>>>>>>>>>>>> on power.  For the rootimg.gz image I provided, I compiled
>>>>>>>>>>>>>> them statically so that I didn't have to worry about
>>>>>>>>>>>>>> including any library dependencies in rootimg.gz.
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> It would be a good idea to research how to use xcat's genimage
>>>>>>>>>>>>>> command to generate stateless images to learn how to do this.
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> If there's any part of the above that you don't fully
>>>>>>>>>>>>>> understand, please ask me to clarify it.  Until you have a
>>>>>>>>>>>>>> stateless image that you can deploy to your power blades,
>>>>>>>>>>>>>> there's no point in trying to debug any VCL specific items.
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> Josh
>>>>>>>>>>>>>> - --
>>>>>>>>>>>>>> - -------------------------------
>>>>>>>>>>>>>> Josh Thompson
>>>>>>>>>>>>>> VCL Developer
>>>>>>>>>>>>>> North Carolina State University
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> my GPG/PGP key can be found at pgp.mit.edu
>>>>>>>>>>>>>> -----BEGIN PGP SIGNATURE-----
>>>>>>>>>>>>>> Version: GnuPG v2.0.17 (GNU/Linux)
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> iEYEARECAAYFAk3mRYsACgkQV/LQcNdtPQNnVgCbB9ZFJn0+C45RC/g75RqGZY/j
>>>>>>>>>>>>>> PZYAniP2Eam7nxgiDWUnp5sKPYPO4OMa
>>>>>>>>>>>>>> =exBV
>>>>>>>>>>>>>> -----END PGP SIGNATURE-----
>>> 
>>> - --
>>> - -------------------------------
>>> Josh Thompson
>>> VCL Developer
>>> North Carolina State University
>>> 
>>> my GPG/PGP key can be found at pgp.mit.edu
>>> -----BEGIN PGP SIGNATURE-----
>>> Version: GnuPG v2.0.17 (GNU/Linux)
>>> 
>>> iEYEARECAAYFAk4VzQoACgkQV/LQcNdtPQM8YQCePg3O5vp5AXEhiO+5aIRIUO/S
>>> 6IgAn1Xt4ytGnmxpfJVteCScFi0dRz15
>>> =Yls1
>>> -----END PGP SIGNATURE-----
> - -- 
> - -------------------------------
> Josh Thompson
> VCL Developer
> North Carolina State University
> 
> my GPG/PGP key can be found at pgp.mit.edu
> -----BEGIN PGP SIGNATURE-----
> Version: GnuPG v2.0.17 (GNU/Linux)
> 
> iEYEARECAAYFAk4V9PEACgkQV/LQcNdtPQM6ZgCfaPLJh9MuEVLqRYdHNLqC8BzQ
> JOsAn35U1e4V+xuxFPajb2rVVcg4gril
> =CWDr
> -----END PGP SIGNATURE-----
