Hello Roberto,

Have you set the NumCPU attribute of the LDom resource to 12? If so, that is why the number of virtual CPUs attached to the LDom changes when you online or offline the resource. The LDom agent issues

    ldm set-vcpu <NumCPU> <ldom>

in the online entry point and removes those VCPUs in the offline/clean entry point. It does this only if the NumCPU attribute is set to a non-zero value.

This behavior gives you two options:

When NumCPU is set to a non-zero value, VCS ensures that the offline node does not reserve CPU capacity for that particular LDom. Those VCPUs can be used to bring up some other LDom on that node. However, in a failover situation, depending on VCPU availability, the LDom resource might come up with fewer VCPUs or fail to come up at all. This can be used as an intelligent load-balancing technique.

If you set NumCPU to 0 (the default), the VCPUs for that particular LDom remain reserved even on the offline node, and no other LDom can use them. This guarantees that every time the LDom fails over, it gets the VCPUs originally assigned by the user.

Hope this helps.

Vikas.

----- Original Message ----
From: "Ballan, Roberto" <[email protected]>
To: ldoms-discuss at opensolaris.org
Sent: Monday, April 21, 2008 10:30:19 AM
Subject: [ldoms-discuss] LDOM/VCS issue/question

Please, any suggestion or help would be great.

I have implemented LDOM/VCS on Solaris 10 (T2000 servers). I am having some issues during my LDOM/VCS failover test. Please see the details below and let me know if you have any suggestions on this topic.

====================================================================
gaxgpvw65xu# ldm list
NAME             STATE    FLAGS   CONS    VCPU  MEMORY   UTIL  UPTIME
primary          active   -n-cv   SP      20    20G      0.1%  22m
ldg1             inactive -----           5     4G
ldg2             inactive -----           8     11G
gaxgpvw65xu#
====================================================================
gaxgpvw65xu# hastatus -sum

-- SYSTEM STATE
-- System               State                Frozen

A  gaxgpvw64xu          RUNNING              0
A  gaxgpvw65xu          RUNNING              0

-- GROUP STATE
-- Group           System               Probed     AutoDisabled    State

B  ClusterService  gaxgpvw64xu          Y          N               ONLINE
B  ClusterService  gaxgpvw65xu          Y          N               OFFLINE
B  SG-ldg1         gaxgpvw64xu          Y          N               OFFLINE
B  SG-ldg1         gaxgpvw65xu          Y          N               ONLINE
gaxgpvw65xu#
====================================================================

SG-ldg1 is now online on gaxgpvw65xu, and ldg1 has 12 VCPUs and 4G assigned. I am not sure this is correct, since the resources were different before (see the output above). I used the VCS GUI to bring SG-ldg1 online on gaxgpvw65xu.

gaxgpvw65xu# ldm list
NAME             STATE    FLAGS   CONS    VCPU  MEMORY   UTIL  UPTIME
primary          active   -n-cv   SP      20    20G      1.5%  34m
ldg1             active   -n---   5000    12    4G       5.5%  1m
ldg2             inactive -----           8     11G
gaxgpvw65xu#
====================================================================

At this point SG-ldg1 is ONLINE on 65 and OFFLINE on 64.

====================================================================

Now I run:

gaxgpvw65xu# hagrp -switch SG-ldg1 -to gaxgpvw64xu
gaxgpvw65xu# hastatus -sum

-- SYSTEM STATE
-- System               State                Frozen

A  gaxgpvw64xu          RUNNING              0
A  gaxgpvw65xu          RUNNING              0

-- GROUP STATE
-- Group           System               Probed     AutoDisabled    State

B  ClusterService  gaxgpvw64xu          Y          N               ONLINE
B  ClusterService  gaxgpvw65xu          Y          N               OFFLINE
B  SG-ldg1         gaxgpvw64xu          Y          N               ONLINE
B  SG-ldg1         gaxgpvw65xu          Y          N               OFFLINE
gaxgpvw65xu#
====================================================================

Now SG-ldg1 is ONLINE on 64 and OFFLINE on 65.

====================================================================

Now I try the reverse, from gaxgpvw64xu to gaxgpvw65xu (which is where the issue is, in my opinion):

gaxgpvw64xu# hagrp -switch SG-ldg1 -to gaxgpvw65xu
gaxgpvw65xu# hastatus -sum

-- SYSTEM STATE
-- System               State                Frozen

A  gaxgpvw64xu          RUNNING              0
A  gaxgpvw65xu          RUNNING              0

-- GROUP STATE
-- Group           System               Probed     AutoDisabled    State

B  ClusterService  gaxgpvw64xu          Y          N               ONLINE
B  ClusterService  gaxgpvw65xu          Y          N               OFFLINE
B  SG-ldg1         gaxgpvw64xu          Y          N               OFFLINE
B  SG-ldg1         gaxgpvw65xu          Y          N               ONLINE
gaxgpvw65xu#
====================================================================

Now, for some reason, it seems to be working fine: SG-ldg1 is ONLINE on 65 and OFFLINE on 64.

--------------------------------------------------------------------

Now I run:

gaxgpvw65xu# hagrp -switch SG-ldg1 -to gaxgpvw64xu
gaxgpvw65xu# hastatus -sum

-- SYSTEM STATE
-- System               State                Frozen

A  gaxgpvw64xu          RUNNING              0
A  gaxgpvw65xu          RUNNING              0

-- GROUP STATE
-- Group           System               Probed     AutoDisabled    State

B  ClusterService  gaxgpvw64xu          Y          N               ONLINE
B  ClusterService  gaxgpvw65xu          Y          N               OFFLINE
B  SG-ldg1         gaxgpvw64xu          Y          N               ONLINE
B  SG-ldg1         gaxgpvw65xu          Y          N               OFFLINE
gaxgpvw65xu#
====================================================================

Now SG-ldg1 is ONLINE on gaxgpvw64xu, but I am not able to telnet to localhost 5000:

gaxgpvw64xu# telnet localhost 5000
Trying 127.0.0.1...
Connected to localhost.
Escape character is '^]'.

On gaxgpvw65xu, ldg1 now shows "0 VCPU" after I killed the telnet session that was hanging on gaxgpvw64xu. I am trying to understand what is going on here, but I am not sure.

====================================================================
gaxgpvw65xu# ldm list
NAME             STATE    FLAGS   CONS    VCPU  MEMORY   UTIL  UPTIME
primary          active   -n-cv   SP      20    20G      0.3%  1h 7m
ldg1             inactive -----           0     4G
ldg2             inactive -----           8     11G
gaxgpvw65xu#

It seems that VCS does not assign it the right resources.

====================================================================

Thanks.

Best Regards.

Roberto Ballan
e-mail: x2ballan at southernco.com
