I had a couple of awkward problems to solve when getting a zencommand plugin to 
work, so I thought I would share. If my analysis is correct, this information 
might also be useful to others and could possibly be used to enhance the 
documentation on creating zencommand plugins.

So please take a look and see if this is useful.

J.

== Tip when working with zencommands. ==
 
If ONE data point anywhere in a template has an error (a template can contain 
multiple graphs, data sources and data points), then the data does not get 
stored in rrdtool, even though testing with "zencommand run" reports a 
"storing" message. 
It might be safer to define data sources and graphs in a few separate templates 
instead of one template.
That way an error in one piece of data would not stop all data from being 
retrieved.

I still find it a little awkward to work out how best to manage classes and 
attributes in Zenoss.

== Large data values received had a problem being stored if set to AVERAGE, use COUNTER instead ==

The value returned was FramesReceived=74609722, but rrdtool tried to insert 
74609722.0.

<pre>
2008-04-17 12:36:12 ERROR zen.RRDUtil: rrd error not a simple integer: '74609722.0' Devices/frigg.ie.commprove.test/iubTGenStatusTxt_FramesReceived
</pre>

The fix is easy: use COUNTER instead of AVERAGE for data points with such large 
values.

(A value monitored for a long time could grow and suddenly cause this kind of 
problem in a surprising way.)

== zencommand has some problems scaling. ==

... if you quickly hack together a plugin! :)
 
"ERROR zen.zencommand: [Errno 24] Too many open files"

I see per-process open file limits of 254 and 258 on Solaris 10 boxes.
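
To see what limit a process actually gets on a given box, here is a minimal Python sketch using the standard `resource` module. It reports the limits of the process that runs it, so it only matches zencommand if it is run as the same user and from the same environment, which is an assumption about your setup. The function names are just illustrative.

```python
import resource


def open_file_limits():
    """Return the (soft, hard) per-process limit on open file descriptors."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def describe(limit):
    """Render a limit value, treating RLIM_INFINITY as 'unlimited'."""
    return "unlimited" if limit == resource.RLIM_INFINITY else str(limit)


if __name__ == "__main__":
    soft, hard = open_file_limits()
    print("soft limit: %s, hard limit: %s" % (describe(soft), describe(hard)))
```

From a shell, `ulimit -n` reports the same soft limit.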

zencommand fires off multiple processes (number of servers * number of data 
sources) to retrieve data.

One may edit zencommand.py and reduce MAX_CONNECTIONS:
gsed -i s/MAX_CONNECTIONS=256/MAX_CONNECTIONS=16/ $ZENHOME/Products/ZenRRD/zencommand.py

Problems were reduced but not eliminated. 
There were 15 servers in the list and 3 zencommand data sources, so 15 * 3 = 45 
commands, which is not too many.
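
The arithmetic can be made a bit more concrete with a rough estimate. The per-command descriptor cost and the baseline here are assumptions, not measured zencommand figures: three descriptors per command assumes one pipe each for stdin, stdout and stderr, and the baseline of 20 is a guess for log files, sockets and so on.

```python
def estimated_descriptors(servers, datasources, fds_per_command=3, baseline=20):
    """Return (commands, descriptors) if every command runs at the same time.

    fds_per_command and baseline are illustrative guesses, not zencommand
    internals.
    """
    commands = servers * datasources
    return commands, baseline + commands * fds_per_command


def fits(descriptors, limit):
    """True if the estimate stays below the per-process open file limit."""
    return descriptors < limit


if __name__ == "__main__":
    for per_command in (3, 6):
        commands, fds = estimated_descriptors(15, 3, fds_per_command=per_command)
        print("%d commands, ~%d descriptors, fits in 254: %s"
              % (commands, fds, fits(fds, 254)))
```

Under these guesses 45 commands fit within a limit of 254 at three descriptors each, but not at six, so a small command count can still get close to the limit if each command costs more than expected.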

If you implement a zencommand plugin that causes zencommand to use up file 
handles you can run into this issue.
I have a generic plugin which is called with different parameters to feed 
several data sources.
The performance template was applied to a list of Solaris servers.

The first incarnation of the plugin is a script which calls the Nagios plugin 
check_http and also uses a temporary file. I figure the plugin is too 
slow/heavy: bash is spawned, plus another process to call check_http.
Essentially, best practice for plugins is to keep them very light and do any 
messing/work on the server side.
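
As an illustration of "very light", here is a sketch of a plugin that does the HTTP fetch in a single Python process, with no shell wrapper, no second process and no temporary file. This is not my actual plugin: the function names and the data point names `time` and `size` are made up for the example, and it assumes Nagios-style output of the form `STATUS: message|name=value name=value` with exit code 0 for OK and 2 for CRITICAL.

```python
import sys
import time
import urllib.request


def format_result(status, message, perfdata):
    """Build one Nagios-style output line: 'STATUS: message|k=v k=v'."""
    line = "%s: %s" % (status, message)
    if perfdata:
        pairs = " ".join("%s=%s" % (k, v) for k, v in sorted(perfdata.items()))
        line = "%s|%s" % (line, pairs)
    return line


def check_http(url, timeout=10.0):
    """Fetch url in-process and return (exit_code, output_line)."""
    start = time.monotonic()
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            size = len(response.read())
    except (OSError, ValueError) as exc:
        # URLError and socket timeouts are OSError; a malformed URL is ValueError.
        return 2, format_result("CRITICAL", str(exc), {})
    elapsed = time.monotonic() - start
    perfdata = {"time": "%.3f" % elapsed, "size": size}
    return 0, format_result("OK", "fetched %s" % url, perfdata)


if __name__ == "__main__" and len(sys.argv) == 2:
    code, line = check_http(sys.argv[1])
    print(line)
    sys.exit(code)
```

It would be called with the URL as its only argument. Whether this is light enough still depends on how many copies run at once.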

We wish to add more servers and more data sources using plugins like this, so 
it must scale.
There is a limit on the resources plugins are allowed to use, which may not be 
very obvious.



