Could you try running updatenodestat directly on the service node that is
very slow to respond to updateflag.awk?

Steps:
   cd /opt/xcat/bin
   ln -s ../bin/xcatclient updatenodestat
   time updatenodestat <node> booted

This shows how long the command takes to finish on the service node.
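
A single timing can be noisy, so it may help to repeat the measurement a few
times. The following is only a minimal sketch: the time_runs helper is a
hypothetical name, not part of xCAT, and it relies on "date +%s", so it only
resolves whole seconds.

```shell
#!/bin/sh
# Run a command several times and print the wall-clock seconds of each run.
# Usage: time_runs <count> <command> [args...]
# time_runs is a hypothetical helper for illustration, not an xCAT command.
time_runs() {
    count=$1
    shift
    i=1
    while [ "$i" -le "$count" ]; do
        start=$(date +%s)
        # Discard the command's output; only the elapsed time matters here.
        "$@" >/dev/null 2>&1 || echo "run $i: command failed" >&2
        end=$(date +%s)
        echo "run $i: $((end - start))s"
        i=$((i + 1))
    done
}

# Example, with <node> replaced by a real node name:
# time_runs 5 updatenodestat <node> booted
```

If sub-second resolution is needed, the shell's own "time" keyword, as used in
the steps above, is the better choice for individual runs.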

Thanks
Best Regards
----------------------------------------------------------------------
 Wang Xiaopeng (王晓朋)
 IBM China System Technology Laboratory
 Tel: 86-10-82453455
 Email: w...@cn.ibm.com
 Address: 28,ZhongGuanCun Software Park,No.8 Dong Bei Wang West Road,
Haidian District Beijing P.R.China 100193



From:   Russell Jones <russell-l...@jonesmail.me>
To:     xcat-user@lists.sourceforge.net,
Date:   2014/03/13 22:40
Subject:        Re: [xcat-user] Additional performance issues



xCAT service node specs:

CPU - 2 x Dual-Core AMD Opteron(tm) Processor 2212
Memory - 6 GB (2 x 1 GB and 2 x 512 MB per CPU)
Diskful - Shared /install, shared /tftpboot
useflowcontrol is disabled. The xCAT master was 2.8.2, then upgraded to 2.8.3.
The xCAT DB contains 7851 nodes in the nodelist table.


I also ran a few timing tests, running updateflag.awk from a compute node
against each service node and the master.

One service node takes about 3 times longer than the other 2 service nodes.

Using the xCAT master (cmx04-hc) is several times faster than using any
service node (about 7 times faster than the quickest one in the runs below).

[root@c103n69 xcatpost]# time ./updateflag.awk servicefarm03-hc 3002 "installstatus booted"

real    0m10.175s
user    0m0.000s
sys     0m0.002s

[root@c103n69 xcatpost]# time ./updateflag.awk servicefarm02-hc 3002 "installstatus booted"

real    0m3.679s
user    0m0.001s
sys     0m0.001s

[root@c103n69 xcatpost]# time ./updateflag.awk servicefarm01-hc 3002 "installstatus booted"

real    0m3.653s
user    0m0.002s
sys     0m0.000s

[root@c103n69 xcatpost]# time ./updateflag.awk master 3002 "installstatus booted"

real    0m0.491s
user    0m0.000s
sys     0m0.002s
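
To compare the "real" times above more directly, they can be converted to
seconds and expressed relative to the master. This is a rough sketch only:
to_seconds and slowdown are hypothetical helper names, and the input is
copied by hand from the runs above.

```shell
#!/bin/sh
# Convert a `time`-style value such as 0m10.175s into plain seconds.
to_seconds() {
    printf '%s\n' "$1" | awk -F'[ms]' '{ printf "%.3f\n", $1 * 60 + $2 }'
}

# Print how many times slower each measurement is than a baseline.
# Argument: the baseline "real" time. Stdin: lines of "<server> <real time>".
slowdown() {
    base=$(to_seconds "$1")
    while read -r server real; do
        secs=$(to_seconds "$real")
        awk -v s="$server" -v t="$secs" -v b="$base" \
            'BEGIN { printf "%s %.3fs %.1fx\n", s, t, t / b }'
    done
}

# "real" values from the four runs above, with the master as the baseline.
slowdown 0m0.491s <<'EOF'
servicefarm03-hc 0m10.175s
servicefarm02-hc 0m3.679s
servicefarm01-hc 0m3.653s
master 0m0.491s
EOF
```

With these figures the script reports roughly 20.7x for servicefarm03-hc and
about 7.5x and 7.4x for the other two service nodes, relative to the master.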



All of the tests were done while the service and master nodes were idle. On
all the service nodes, the CPU usage of the Install Monitor process jumped
to 99% for the entire duration.

This did not seem to happen on the master node (but then again, the command
finished in under a second).

There are also instances where the Install Monitor process stops responding
entirely on a service node. This causes the updateflag.awk command to hang
indefinitely on the compute nodes.

Issuing a "service xcatd reload" appears to restart the Install Monitor, and
it starts to respond again. However, at that point the hung updateflag.awk
processes on the compute nodes had to be killed.
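
As a stopgap while testing, the manual runs can be bounded so that a stalled
Install Monitor does not leave the command hanging. This is only a sketch and
assumes GNU coreutils "timeout" is installed on the compute node; run_bounded
is a hypothetical wrapper name, and it does nothing about the underlying stall.

```shell
#!/bin/sh
# Run a command with an upper bound on its run time.
# Usage: run_bounded <seconds> <command> [args...]
# Returns the command's exit status, or 124 if the limit was reached
# (124 is the status GNU timeout reports for a timed-out command).
run_bounded() {
    limit=$1
    shift
    timeout "$limit" "$@"
}

# Example with the invocation used in this thread (60 s is an arbitrary limit):
# run_bounded 60 ./updateflag.awk servicefarm03-hc 3002 "installstatus booted" \
#     || echo "status update failed or timed out" >&2
```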

I have looked into enabling the useflowcontrol option, since the docs say it
is enabled by default for new 2.8.3 installs. (Ours was upgraded from
2.8.2, so it was not enabled.)

None of the logs show any errors that point to an issue.

Once useflowcontrol is enabled in the site table, will xcatd have to be
restarted/reloaded on the master and service nodes?



On 3/11/2014 7:32 AM, Lissa Valletta wrote:


      This is happening with only 50 nodes? How much memory and how many
      CPUs do you have on the service nodes? Are the service nodes diskful
      installs? What is the setting for the site attribute useflowcontrol,
      if it is set at all? We have many systems installing a lot more than
      50 nodes without this problem. What is the OS and xCAT level on the
      service node and Management Node?

      Can you monitor /var/log/messages on the Management Node during this
      to see if you are seeing errors from xcatd on the service node?


      Lissa K. Valletta
      8-3/B10
      Poughkeepsie, NY 12601
      (tie 293) 433-3102



      From: Russell Jones <russell-l...@jonesmail.me>
      To: xcat-user@lists.sourceforge.net,
      Date: 03/10/2014 05:25 PM
      Subject: Re: [xcat-user] Additional performance issues





      Hi Wang,

      As a followup to this, are there any additional performance tips or
      things we can do to assist with this performance issue? I don't think
      that the problem we are seeing is caused by the "getpostscripts"
      request, as it doesn't seem like getpostscript.awk connects to port
      3002, the port that the "Install Monitor" uses. As a reminder, the
      problem we are seeing is the "Install Monitor" process on the service
      node being pegged at 100% CPU, and nodes hanging before showing their
      login console, *after* the postscripts have already finished. When
      nodes finally start showing their login console the "install monitor"
      service starts lowering its CPU usage.

      After disabling the site.nodestatus feature we seem to have resolved
      the performance issue. However we would like to use this feature if
      we can, as long as it doesn't cause such a huge drop in service node
      performance.

      Thanks!


      On 3/2/2014 8:59 PM, Xiao Peng Wang wrote:

            Setting 'site.nodestatus=n' has no impact that I can see, except
            that the node status won't be updated, so you can try it to
            solve your issue.

            But I think your issue may be caused by 'getpostscripts' running
            when the diskless nodes boot. You can try setting
            site.precreatemypostscripts=1 to improve the performance of the
            getmypostscript operation.

            I suggest trying them one at a time, and then both together, to
            see the result.


            Thanks
            Best Regards
            

            From: Russell Jones <russell-l...@jonesmail.me>
            To: xcat-user@lists.sourceforge.net,
            Date: 2014/03/02 13:18
            Subject: [xcat-user] Additional performance issues


            Hi all,

            We are seeing pretty consistently that when 50+ diskless nodes
            are booted against the same single service node, before showing
            their login console they all hang for around 5-10 minutes after
            postscripts run, while the service node is chugging away at a
            constant 100% of a single core. The process on the service node
            that is at fault is the "install monitor".

            This seems to be tied to the site.nodestatus option. Other than
            disabling this option (and fixing the diskful reinstall loop bug
            with it that I reported earlier), is there a way of improving
            the responsiveness of a service node when it's updating the
            node's status? Would we lose anything from disabling this option
            besides the "status" column not being updated?


            Thanks!

            

            _______________________________________________
            xCAT-user mailing list
            xCAT-user@lists.sourceforge.net
            https://lists.sourceforge.net/lists/listinfo/xcat-user




            